API Reference
Resources Types
KasprApp
A program that runs components of a distributed stream processing application.
| Field | Type | Default | Required |
|---|---|---|---|
| apiVersion | string | kaspr.io/v1alpha1 | Yes |
| kind | string | KasprApp | Yes |
| metadata | ObjectMeta | {} | Yes |
Metadata that all persisted resources must have. | |||
KasprAppSpec
Specification of the desired settings of the application.
| Field | Type | Default | Required |
|---|---|---|---|
| version | string | No | |
The kaspr version. Defaults to the latest. | |||
| replicas | integer | 1 | No |
The number of desired instances. | |||
| image | string | No | |
The container image to use. Defaults to the corresponding image of the version configuration. | |||
| bootstrapServers | string | Yes | |
Bootstrap server names to connect to. This should be given as a comma separated list of <hostname>:<port> pairs. | |||
| tls | TLS | No | |
TLS configuration. Provide an empty entry | |||
| authentication | KafkaAuthentication | No | |
Authentication configuration for Kafka. | |||
| resources | ResourceRequirements | No | |
Compute Resources required by each instance on the application. | |||
| storage | StorageRequirements | Yes | |
Disk storage required by each instance of the application. | |||
| pythonPackages | PythonPackagesSpec | No | |
Python package management configuration for automatic package installation via init containers. | |||
KasprAppConfig
KasprApp configuration parameters.
| Field | Type | Default | Required |
|---|---|---|---|
| topicReplicationFactor | integer | 3 | No |
The default replication factor for topics created by the application. | |||
| topicPartitions | integer | 3 | No |
Default number of partitions for topics created by the application. | |||
| topicAllowDeclare | boolean | false | No |
This setting controls whether or not creation of topics is allowed. | |||
| schedulerEnabled | boolean | false | No |
This setting controls whether or not the message scheduler is enabled. | |||
| schedulerDebugStatsEnabled | boolean | false | No |
This setting controls whether or not scheduler debug statistics are printed to log. | |||
| schedulerTopicPartitions | integer | No | |
Default number of partitions for internal scheduler related topics. Defaults to general topic partition configuration if not set. | |||
| schedulerCheckpointSaveIntervalSeconds | number:float | 1.3 | No |
How often we save checkpoint to storage (and to changelog topic) | |||
| schedulerDispatcherDefaultCheckpointLookbackDays | integer | 7 | No |
Number of days the dispatcher will look back from current date to seek starting a point for dispatching messages when an checkpoint is not found. This is mostly used during initial app deployment. | |||
| schedulerDispatcherCheckpointInterval | number:float | 10.0 | No |
How often we checkpoint the dispacher's location in the timetable (in seconds). | |||
| schedulerJanitorCheckpointInterval | number:float | 10.0 | No |
How often we checkpoint the janitor's location in the timetable (in seconds). | |||
| schedulerJanitorCleanIntervalSeconds | number:float | 3.0 | No |
How often the janitor attempts to clean. | |||
| schedulerJanitorHighwaterOffsetSeconds | number:float | 14400.0 | No |
Number of seconds the janitor trails current from the highwater timetable location. | |||
| storeRocksdbWriteBufferSize | integer | 67108864 | No |
This is the maximum write buffer size. It represents the amount of data to build up in memory before converting to a sorted on-disk file. The default is 64 MB. | |||
| storeRocksdbMaxWriteBufferNumber | integer | 3 | No |
Maximum number of write buffers (memtables) that can be built in memory at the same time. | |||
| storeRocksdbTargetFileSizeBase | integer | 67108864 | No |
Target size for files at level-1 in the LSM tree. Used to determine the size of the SST (Sorted String Table) files that RocksDB generates during compactions. The default is 64 MB. | |||
| storeRocksdbBlockCacheSize | integer | No | |
Size for caching uncompressed data. Defauls to about 30% of application's total memory budget | |||
| storeRocksdbBlockCacheCompressedSize | integer | 268435456 | No |
Size for caching compressed data. Defaults to 254MB. | |||
| storeRocksdbBloomFilterSize | integer | 3 | No |
A Bloom filter in RocksDB is used to quickly check whether a key might be in an SST (Sorted String Table) file without actually reading the file, which can significantly improve read performance. | |||
| storeRocksdbSetCacheIndexAndFilterBlocks | boolean | 3 | No |
If set to true, index and filter blocks will be stored in block cache, together with all other data blocks. | |||
| webBasePath | string | No | |
Base HTTP path for serving web requests. | |||
| webPort | integer | 6065 | No |
Port number between 1024 and 65535 to use for the web server. | |||
KasprAgent
A distributed system processing events in a stream.
| Field | Type | Default | Required |
|---|---|---|---|
| apiVersion | string | kaspr.io/v1alpha1 | Yes |
| kind | string | KasprAgent | Yes |
| metadata | ObjectMeta | {} | Yes |
Metadata that all persisted resources must have. An agent must declare a label | |||
| spec | KasprAgentSpec | {} | Yes |
Specification of the desired behavior of the KasprAgent. | |||
KasprAgentSpec
Specification of the desired behavior of the KasprAgent.
| Field | Type | Default | Required |
|---|---|---|---|
| description | string | No | |
A short description of the agent for documentation purpose. | |||
| input | KasprAgentInput | {} | Yes |
Input configuration for the agent. | |||
| output | KasprAgentOutput | {} | No |
Output configuration for the agent. Processed events are sent out to the optional output. | |||
| processors | KasprAgentProcessors | {} | No |
Sequence of operations that handle input events. Processors are optional and can be used to modify or act on the input events before they are sent to the output. | |||
KasprAgentInput
Input configuration for the KasprAgent.
| Field | Type | Default | Required |
|---|---|---|---|
| declare | boolean | false | No |
If true, the agent will create the input topic/channel if it does not exist. | |||
| take | KasprAgentInputBuffer | {} | No |
Buffering configuration. This allows the agent to collect multiple events before processing them. | |||
| topic | KasprAgentInputTopic | {} | No |
Input topic configuration; mutually exclusive with | |||
| channel | KasprAgentInputChannel | {} | No |
Input channel configuration; mutually exclusive with | |||
KasprAgentInputTopic
Input topic configuration for the KasprAgent.
| Field | Type | Default | Required |
|---|---|---|---|
| name | string | No | |
Topic names to subscribe to. This can be a single topic or a comma-separated list of topics. Required, but mutually exclusive with | |||
| pattern | string | No | |
Regex pattern of topic names to subscribe to. Required, but mutually exclusive with | |||
| keySerializer | string | json | No |
Serializer for the key portion of the kafka message. Must be one of | |||
| valueSerializer | string | json | No |
Serializer for the value portion of the kafka message. Must be one of | |||
KasprAgentInputChannel
Input channel configuration for the KasprAgent.
| Field | Type | Default | Required |
|---|---|---|---|
| name | string | Yes | |
Name of the channel to subscribe to | |||
KasprAgentInputBuffer
Buffering configuration for the input. This allows the agent to collect multiple events before processing them.
| Field | Type | Default | Required |
|---|---|---|---|
| max | integer | Yes | |
Maximum number of events to buffer before processing them. | |||
| within | string | Yes | |
Time window for buffering events. Events will be processed after this duration, even if the max limit has not been reached. Examples: | |||
KasprAgentOutput
Output configuration for the KasprAgent.
| Field | Type | Default | Required |
|---|---|---|---|
| topics | KasprAgentOutputTopic[] | [] | No |
List of topics to publish to. | |||
KasprAgentOutputTopic
Output topic configuration for the KasprAgent.
| Field | Type | Default | Required |
|---|---|---|---|
| name | string | Yes | |
Topic name | |||
| ack | boolean | false | No |
Wait for broker acknoledgement before considering the message sent. | |||
| keySerializer | string | json | No |
Serializer for the key portion of the kafka message. Must be one of | |||
| valueSerializer | string | json | No |
Serializer for the value portion of the kafka message. Must be one of | |||
| keySelector | Code | No | |
Custom code to be used to select the key of the message. The code must define a function that takes a single argument and returns a value. | |||
| valueSelector | Code | No | |
Custom code to be used to select the value of the message. The code must define a function that takes a single argument and returns a value. | |||
| partitionSelector | Code | No | |
Custom code to be used to select the partition number of the message. The code must define a function that takes a single argument and returns an integer. | |||
| headersSelector | Code | No | |
Custom code to be used to select the headers of the message. The code must define a function that takes a single argument and returns a key and value map. | |||
| predicate | Code | No | |
Custom code to be used to select the messages to be sent. The code must define a function that takes a single argument and returns a boolean value. Predicates that return | |||
Code
Injectable custom code that is used achieve a desired outcome in a specific context.
| Field | Type | Default | Required |
|---|---|---|---|
| entrypoint | string | No | |
Name of the function that is invoked. If code defines multiple functions, the first function is used as the entrypoint. To avoid ambiguity, it is recommended to explicilty specify the entrypoint. | |||
| python | string | Yes | |
Python code to be executed. The code must define a valid Python function to use as an entrypoint. | |||
KasprAgentProcessors
Chain of processors that modify input values.
| Field | Type | Default | Required |
|---|---|---|---|
| pipeline | list | [] | Yes |
Sequence of operations to run. | |||
| init | object | {} | No |
Initialization code to run. | |||
| operations | list | [] | Yes |
Chain of operations to run on input events. | |||
KasprAgentProcessorOperation
A transform or action to take on input events.
| Field | Type | Default | Required |
|---|---|---|---|
| name | string | Yes | |
A unique name for the operation. This is used to identify the operation in the pipeline. | |||
| tables | KasprAgentProcessorOperationTable[] | No | |
List of tables which are made available in the operation's function. | |||
| map | Code | No | |
Takes an input value and returns an output value which is passed down to the next operation or the agent's output . | |||
| filter | Code | No | |
Takes an input value and determines if the event is eligible to be passed down to the next operation or the agent's output. If the filter function returns | |||
KasprAgentProcessorOperationTable
Reference to a KasprTable and how it is made available to an operation’s function.
| Field | Type | Default | Required |
|---|---|---|---|
| name | string | Yes | |
The name of the | |||
| paramName | string | Yes | |
The name of the parameter in which the table is made available in the operation's function. | |||
KasprJoin
A reactive key join between two KasprTable resources. Emits combined results to a named output channel when either table changes.
| Field | Type | Default | Required |
|---|---|---|---|
| apiVersion | string | kaspr.io/v1alpha1 | Yes |
| kind | string | KasprJoin | Yes |
| metadata | ObjectMeta | {} | Yes |
Metadata that all persisted resources must have. A join must declare a label | |||
| spec | KasprJoinSpec | {} | Yes |
Specification of the desired behavior of the KasprJoin. | |||
KasprJoinSpec
Specification of a key join between two tables.
| Field | Type | Default | Required |
|---|---|---|---|
| description | string | No | |
A short description of the join for documentation purposes. | |||
| leftTable | string | Yes | |
Name of the left-side KasprTable resource. This is the "driving" table whose changes trigger foreign key extraction. | |||
| rightTable | string | Yes | |
Name of the right-side KasprTable resource (the "lookup" table). Must belong to the same KasprApp. | |||
| extractor | Code | {} | Yes |
Python function that extracts the join key from left-table values. The returned value is used to look up the corresponding record in the right table. | |||
| type | string | inner | No |
Join semantics. | |||
| outputChannel | string | {name}-channel | No |
Name of the output channel for joined results. KasprAgents reference this via | |||
KasprJoinStatus
Status of a KasprJoin resource, populated by the operator.
| Field | Type | Default | Required |
|---|---|---|---|
| app | object | {} | |
Status of the referenced KasprApp. Contains | |||
| leftTable | object | {} | |
Status of the referenced left KasprTable. Contains | |||
| rightTable | object | {} | |
Status of the referenced right KasprTable. Contains | |||
| configMap | string | ||
Name of the ConfigMap containing the serialized join spec. | |||
| hash | string | ||
Hash of the current join configuration. | |||
KasprTable
A table is a distributed key-value store backed by a Kafka changelog topic, used for stateful stream processing within a KasprApp.
| Field | Type | Default | Required |
|---|---|---|---|
| apiVersion | string | kaspr.io/v1alpha1 | Yes |
| kind | string | KasprTable | Yes |
| metadata | ObjectMeta | {} | Yes |
Metadata that all persisted resources must have. | |||
| spec | KasprTableSpec | {} | Yes |
Specification of the desired table configuration. | |||
KasprTableSpec
Specification of a KasprTable resource.
| Field | Type | Default | Required |
|---|---|---|---|
| name | string | Yes | |
The name of the table. | |||
| description | string | No | |
A short description of the table. | |||
| global | boolean | false | No |
If true, the table is global. Global tables have a full copy of the data on every node. | |||
| defaultSelector | Code | No | |
Python code that returns a default value when a key is not found in the table. | |||
| keySerializer | string | json | No |
Serializer for the table key. Must be | |||
| valueSerializer | string | json | No |
Serializer for the table value. Must be | |||
| partitions | integer | No | |
Number of partitions in the table's changelog topic. | |||
| extraTopicConfigs | object | {} | No |
Additional Kafka topic configurations applied when creating the table's changelog topic (e.g., | |||
| options | object | {} | No |
Advanced table configuration options. | |||
| window | KasprTableWindow | No | |
Window configuration for the table. Enables tumbling or hopping windowed aggregation. | |||
KasprTableWindow
Window configuration for a table. Exactly one of tumbling or hopping should be specified.
| Field | Type | Default | Required |
|---|---|---|---|
| tumbling | KasprTableWindowTumbling | No | |
Tumbling window configuration. Windows are fixed-size, non-overlapping time intervals. | |||
| hopping | KasprTableWindowHopping | No | |
Hopping window configuration. Windows are fixed-size but advance by a step interval, so they may overlap. | |||
| relativeTo | string | stream | No |
How events are assigned to windows. Must be | |||
| relativeToSelector | Code | No | |
Python function that extracts a timestamp from the event for window assignment. Only used when | |||
KasprTableWindowTumbling
Tumbling window configuration. Windows are fixed-size, non-overlapping time intervals.
| Field | Type | Default | Required |
|---|---|---|---|
| size | string | Yes | |
The window duration (e.g., | |||
| expires | string | No | |
How long to retain data for each window after it closes (e.g., | |||
KasprTableWindowHopping
Hopping window configuration. Windows are fixed-size and advance by a step interval.
| Field | Type | Default | Required |
|---|---|---|---|
| size | string | Yes | |
The window duration (e.g., | |||
| step | string | Yes | |
The interval at which new windows are created (e.g., | |||
| expires | string | No | |
How long to retain data for each window after it closes (e.g., | |||
KasprWebView
A web view provides an HTTP endpoint to a KasprApp, enabling remote access to the application.
| Field | Type | Default | Required |
|---|---|---|---|
| name | string | Yes | |
Name of the web view interface | |||
| description | string | No | |
A short description of the web view for documentation purpose. | |||
| request | KasprWebViewRequest | Yes | |
Describes the request endpoint listener. | |||
| response | KasprWebViewResponse | No | |
Describes the web view's response output. | |||
KasprWebViewRequest
Specification for the web view request endpoint.
| Field | Type | Default | Required |
|---|---|---|---|
| method | string | Yes | |
HTTP method to use for the request. Must be one of | |||
| path | string | Yes | |
Web path to listen on. | |||
KasprWebViewResponse
Specification for the web view’s response output.
| Field | Type | Default | Required |
|---|---|---|---|
| contentType | string | No | |
Response content type. Must be one of | |||
| statusCode | integer | No | |
HTTP status code to return on success. | |||
| headers | object | No | |
Headers as key and value pairs to return in the response | |||
| bodySelector | KasprWebViewResultHandler | No | |
Custom function to transform the response body on success or error. | |||
| statusCodeSelector | KasprWebViewResultHandler | No | |
Custom function to determine the status code that is returned in response. | |||
| headersSelector | KasprWebViewResultHandler | No | |
Custom function to determine the headers returned in response. | |||
KasprWebViewResultHandler
Function handlers for success or error.
| Field | Type | Default | Required |
|---|---|---|---|
| onSuccess | Code | No | |
Function to transform response on success. | |||
| onError | Code | No | |
Function to transform response on error. | |||
KafkaAuthentication
Kafka authentication configuration.
| Field | Type | Default | Required |
|---|---|---|---|
| type | string | No | |
The type of authentication to use. Must be one of | |||
| username | string | No | |
Username used for authentication. | |||
| passwordSecret | PasswordSecret | No | |
Details of the kubernetes secret where the authentication password is stored. | |||
PasswordSecret
Kubernetes secret.
| Field | Type | Default | Required |
|---|---|---|---|
| secretName | string | Yes | |
The name of the Kubernetes secret resource containting the password. | |||
| passwordKey | string | Yes | |
The name of the key in the Secret under which the password is stored. | |||
StorageRequirements
Disk storage configuration (disk).
| Field | Type | Default | Required |
|---|---|---|---|
| type | string | Yes | |
Storage type, must be | |||
| class | string | Yes | |
The storage class to use for dynamic volume allocation. | |||
| deleteClaim | boolean | false | No |
Specifies if the persistent volume claim has to be deleted when the app is undeployed. | |||
PythonPackagesSpec
Configuration for automatic Python package installation. Packages are installed by an init container before the main application starts.
| Field | Type | Default | Required |
|---|---|---|---|
| packages | string[] | Yes | |
List of Python packages to install (e.g., | |||
| indexUrl | string | No | |
Base URL of the Python Package Index (e.g., | |||
| extraIndexUrls | string[] | No | |
Additional package index URLs to search. | |||
| trustedHosts | string[] | No | |
Hosts to trust for SSL verification (skip certificate checks). | |||
| credentials | PythonPackagesCredentials | No | |
Credentials for private PyPI registry authentication. | |||
| cache | PythonPackagesCache | No | |
Cache configuration for storing installed packages across restarts. | |||
| installPolicy | PythonPackagesInstallPolicy | No | |
Installation behavior policy (retries, timeout, failure handling). | |||
| resources | ResourceRequirements | No | |
Resource requirements for the init container that installs packages. | |||
PythonPackagesCache
Cache configuration for storing installed packages. Supports PVC-based caching, ephemeral (emptyDir) storage, or Google Cloud Storage (GCS).
| Field | Type | Default | Required |
|---|---|---|---|
| type | string | pvc | No |
Cache backend type. Must be | |||
| enabled | boolean | false | No |
Enable PVC-based caching. When | |||
| storageClass | string | No | |
The storage class to use for the packages cache PVC. | |||
| size | string | No | |
Size of the packages cache PVC (e.g., | |||
| accessMode | string | ReadWriteMany | No |
Access mode for the cache PVC. Only | |||
| deleteClaim | boolean | false | No |
Whether to delete the PVC when the app is undeployed. | |||
| gcs | GCSCacheConfig | No | |
GCS cache configuration. Required when | |||
GCSCacheConfig
Google Cloud Storage cache configuration for storing packaged archives remotely.
| Field | Type | Default | Required |
|---|---|---|---|
| bucket | string | Yes | |
GCS bucket name for storing cached package archives. | |||
| prefix | string | kaspr-packages/ | No |
Key prefix for cached archives in the bucket. | |||
| maxArchiveSize | string | 1Gi | No |
Maximum archive size to upload to GCS. Archives exceeding this size are skipped. | |||
| secretRef | GCSSecretReference | Yes | |
Reference to a Secret containing a GCS service account key JSON file. | |||
GCSSecretReference
Reference to a Kubernetes Secret containing a Google Cloud service account key JSON file for GCS authentication.
| Field | Type | Default | Required |
|---|---|---|---|
| name | string | Yes | |
Name of the Secret containing the service account key. | |||
| key | string | sa.json | No |
Key within the Secret containing the JSON file. | |||
PythonPackagesInstallPolicy
Installation behavior policy for Python package installation.
| Field | Type | Default | Required |
|---|---|---|---|
| retries | integer | 3 | No |
Number of retry attempts for package installation. | |||
| timeout | integer | 600 | No |
Timeout in seconds for package installation. Minimum value is 60. | |||
| onFailure | string | block | No |
Behavior on installation failure. | |||
PythonPackagesCredentials
Credentials for private PyPI registry authentication.
| Field | Type | Default | Required |
|---|---|---|---|
| secretRef | SecretReference | Yes | |
Reference to a Secret containing PyPI registry credentials. | |||
SecretReference
Reference to a Kubernetes Secret containing registry credentials.
| Field | Type | Default | Required |
|---|---|---|---|
| name | string | Yes | |
Name of the Secret containing credentials. | |||
| usernameKey | string | username | No |
Key in the Secret for the username. | |||
| passwordKey | string | password | No |
Key in the Secret for the password. | |||