# si_tests.clients.gcore.box_api.InferencesApi

All URIs are relative to *http://localhost*

Method | HTTP request | Description
------------- | ------------- | -------------
[**v1_create_inference**](InferencesApi.md#v1_create_inference) | **POST** /v1/{project_name}/inferences | Create inference deployment
[**v1_delete_inference**](InferencesApi.md#v1_delete_inference) | **DELETE** /v1/{project_name}/inferences/{inference_name} | Delete inference deployment
[**v1_get_inference**](InferencesApi.md#v1_get_inference) | **GET** /v1/{project_name}/inferences/{inference_name} | Get inference deployment
[**v1_get_inference_api_key**](InferencesApi.md#v1_get_inference_api_key) | **GET** /v1/{project_name}/inferences/{inference_name}/apikey | Get inference API key
[**v1_get_inference_logs**](InferencesApi.md#v1_get_inference_logs) | **GET** /v1/{project_name}/inferences/{inference_name}/logs | Get inference logs
[**v1_list_inferences**](InferencesApi.md#v1_list_inferences) | **GET** /v1/{project_name}/inferences | List inference deployments
[**v1_update_inference**](InferencesApi.md#v1_update_inference) | **PUT** /v1/{project_name}/inferences/{inference_name} | Update inference deployment


# **v1_create_inference**
> V1InferenceResponse v1_create_inference(project_name, v1_create_inference_request, dry_run=dry_run)

Create inference deployment

This endpoint deploys a standalone containerized inference service with the configuration you supply,
such as the container image, resource requirements, scaling options, and networking settings.
The deployment can span multiple regions for high availability.

Inference deployments are containerized services that run machine learning models or related components.
They can be created directly using this endpoint or as part of an application deployment from the apps catalog.

Use this endpoint when you need to:
- Deploy a single machine learning model or service
- Create a custom inference deployment with specific configuration
- Deploy an inference that is not available in the apps catalog

Note: This endpoint creates standalone inference deployments that you can manage directly.
If you need to deploy a pre-configured application with multiple components (e.g., a model API
and a UI), consider using the `/v1/{project_name}/apps/deployments` endpoints instead.

### Example


```python
import si_tests.clients.gcore.box_api
from si_tests.clients.gcore.box_api.models.v1_create_inference_request import V1CreateInferenceRequest
from si_tests.clients.gcore.box_api.models.v1_inference_response import V1InferenceResponse
from si_tests.clients.gcore.box_api.rest import ApiException
from pprint import pprint

# Defining the host is optional and defaults to http://localhost
# See configuration.py for a list of all supported configuration parameters.
configuration = si_tests.clients.gcore.box_api.Configuration(
    host = "http://localhost"
)


# Enter a context with an instance of the API client
with si_tests.clients.gcore.box_api.ApiClient(configuration) as api_client:
    # Create an instance of the API class
    api_instance = si_tests.clients.gcore.box_api.InferencesApi(api_client)
    project_name = 'project_name_example' # str | Project name
    v1_create_inference_request = si_tests.clients.gcore.box_api.V1CreateInferenceRequest() # V1CreateInferenceRequest | Inference deployment configuration
    dry_run = True # bool | Perform validation but do not apply any changes (optional)

    try:
        # Create inference deployment
        api_response = api_instance.v1_create_inference(project_name, v1_create_inference_request, dry_run=dry_run)
        print("The response of InferencesApi->v1_create_inference:\n")
        pprint(api_response)
    except ApiException as e:
        print("Exception when calling InferencesApi->v1_create_inference: %s\n" % e)
```
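
The `dry_run` flag described above can be used as a validation pass before the real call. The following is a minimal sketch of that pattern, not part of the generated client: `validate_then_create` is a hypothetical helper, and `api` stands for any object that exposes `v1_create_inference` with the documented signature (for example an `InferencesApi` instance). It assumes that a failed validation surfaces as a raised exception.

```python
def validate_then_create(api, project_name, request):
    """Validate the request with dry_run=True, then create it for real.

    Hypothetical helper: `api` is any object exposing v1_create_inference
    with the signature documented in this section.
    """
    # Validation pass: no changes are applied. The response table lists
    # 204 No Content for dry runs, so the return value is ignored here.
    api.v1_create_inference(project_name, request, dry_run=True)
    # Only reached if the dry run did not raise (assumption: validation
    # failures are reported as exceptions by the client).
    return api.v1_create_inference(project_name, request)
```

Running the dry run first costs one extra round trip, but it keeps a rejected configuration from being partially applied.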



### Parameters


Name | Type | Description  | Notes
------------- | ------------- | ------------- | -------------
 **project_name** | **str**| Project name | 
 **v1_create_inference_request** | [**V1CreateInferenceRequest**](V1CreateInferenceRequest.md)| Inference deployment configuration | 
 **dry_run** | **bool**| Perform validation but do not apply any changes | [optional] 

### Return type

[**V1InferenceResponse**](V1InferenceResponse.md)

### Authorization

No authorization required

### HTTP request headers

 - **Content-Type**: application/json
 - **Accept**: application/json

### HTTP response details

| Status code | Description | Response headers |
|-------------|-------------|------------------|
| **200** | OK | - |
| **204** | No Content (when dry_run is true) | - |
| **404** | Not Found | - |

[[Back to top]](#) [[Back to API list]](../README.md#documentation-for-api-endpoints) [[Back to Model list]](../README.md#documentation-for-models) [[Back to README]](../README.md)

# **v1_delete_inference**
> v1_delete_inference(project_name, inference_name)

Delete inference deployment

This endpoint permanently removes a standalone inference deployment and all its resources
from all regions where it's deployed.

Inference deployments are containerized services that run machine learning models or related components.
Standalone inferences can be deleted directly using this endpoint.

When you delete an inference deployment:
- All containers running the inference are terminated
- All resources associated with the inference are released
- The inference is removed from all regions where it was deployed

Use this endpoint when you need to:
- Remove an inference that is no longer needed
- Free up resources used by an inference
- Clean up unused or obsolete inference deployments

Note: This endpoint can only delete standalone inference deployments. Inferences that are part of
application deployments from the apps catalog are read-only and cannot be deleted directly.
If you attempt to delete a read-only inference, you will receive an error. Such inferences
must be managed through the parent application deployment using the `/v1/{project_name}/apps/deployments` endpoints.

Warning: This operation cannot be undone. Make sure you no longer need the inference
and its data before deleting it.

### Example


```python
import si_tests.clients.gcore.box_api
from si_tests.clients.gcore.box_api.rest import ApiException
from pprint import pprint

# Defining the host is optional and defaults to http://localhost
# See configuration.py for a list of all supported configuration parameters.
configuration = si_tests.clients.gcore.box_api.Configuration(
    host = "http://localhost"
)


# Enter a context with an instance of the API client
with si_tests.clients.gcore.box_api.ApiClient(configuration) as api_client:
    # Create an instance of the API class
    api_instance = si_tests.clients.gcore.box_api.InferencesApi(api_client)
    project_name = 'project_name_example' # str | Project name
    inference_name = 'inference_name_example' # str | Inference deployment name

    try:
        # Delete inference deployment
        api_instance.v1_delete_inference(project_name, inference_name)
    except ApiException as e:
        print("Exception when calling InferencesApi->v1_delete_inference: %s\n" % e)
```
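
Cleanup scripts often need deletion to be safe to repeat. The sketch below treats the documented 404 response as "already deleted" and re-raises everything else, including the error returned for read-only inferences. `delete_if_present` is a hypothetical helper, and reading the HTTP status from a `status` attribute on the exception is an assumption about the generated `ApiException` class.

```python
def delete_if_present(api, project_name, inference_name):
    """Delete an inference; return False if it did not exist.

    Hypothetical helper: `api` is any object exposing v1_delete_inference
    with the signature documented in this section.
    """
    try:
        api.v1_delete_inference(project_name, inference_name)
    except Exception as exc:
        # Assumption: the exception carries the HTTP status code in a
        # `status` attribute.
        if getattr(exc, "status", None) == 404:
            return False
        # Anything else (for example the error for a read-only inference)
        # is propagated to the caller unchanged.
        raise
    return True
```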



### Parameters


Name | Type | Description  | Notes
------------- | ------------- | ------------- | -------------
 **project_name** | **str**| Project name | 
 **inference_name** | **str**| Inference deployment name | 

### Return type

void (empty response body)

### Authorization

No authorization required

### HTTP request headers

 - **Content-Type**: Not defined
 - **Accept**: application/json

### HTTP response details

| Status code | Description | Response headers |
|-------------|-------------|------------------|
| **204** | No Content | - |
| **404** | Not Found | - |

[[Back to top]](#) [[Back to API list]](../README.md#documentation-for-api-endpoints) [[Back to Model list]](../README.md#documentation-for-models) [[Back to README]](../README.md)

# **v1_get_inference**
> V1InferenceResponse v1_get_inference(project_name, inference_name)

Get inference deployment

This endpoint retrieves detailed information about a specific inference deployment in the project.

Inference deployments are containerized services that run machine learning models or related components.
They can be created directly or as part of an application deployment from the apps catalog.

The response includes:
- Configuration details (container image, resources, etc.)
- Status information across all regions
- Scaling configuration
- Networking and endpoint information

Note: Some inference deployments are created and managed by application deployments from
the apps catalog. These inferences are marked as read-only and cannot be modified or deleted
directly. They must be managed through the parent application deployment using the
`/v1/{project_name}/apps/deployments` endpoints.

### Example


```python
import si_tests.clients.gcore.box_api
from si_tests.clients.gcore.box_api.models.v1_inference_response import V1InferenceResponse
from si_tests.clients.gcore.box_api.rest import ApiException
from pprint import pprint

# Defining the host is optional and defaults to http://localhost
# See configuration.py for a list of all supported configuration parameters.
configuration = si_tests.clients.gcore.box_api.Configuration(
    host = "http://localhost"
)


# Enter a context with an instance of the API client
with si_tests.clients.gcore.box_api.ApiClient(configuration) as api_client:
    # Create an instance of the API class
    api_instance = si_tests.clients.gcore.box_api.InferencesApi(api_client)
    project_name = 'project_name_example' # str | Project name
    inference_name = 'inference_name_example' # str | Inference deployment name

    try:
        # Get inference deployment
        api_response = api_instance.v1_get_inference(project_name, inference_name)
        print("The response of InferencesApi->v1_get_inference:\n")
        pprint(api_response)
    except ApiException as e:
        print("Exception when calling InferencesApi->v1_get_inference: %s\n" % e)
```



### Parameters


Name | Type | Description  | Notes
------------- | ------------- | ------------- | -------------
 **project_name** | **str**| Project name | 
 **inference_name** | **str**| Inference deployment name | 

### Return type

[**V1InferenceResponse**](V1InferenceResponse.md)

### Authorization

No authorization required

### HTTP request headers

 - **Content-Type**: Not defined
 - **Accept**: application/json

### HTTP response details

| Status code | Description | Response headers |
|-------------|-------------|------------------|
| **200** | OK | - |
| **404** | Not Found | - |

[[Back to top]](#) [[Back to API list]](../README.md#documentation-for-api-endpoints) [[Back to Model list]](../README.md#documentation-for-models) [[Back to README]](../README.md)

# **v1_get_inference_api_key**
> V1InferenceAPIKeySecretResponse v1_get_inference_api_key(project_name, inference_name)

Get inference API key

This endpoint retrieves the API key that can be used to authenticate requests to the inference deployment.
The API key is only available if API key authentication is enabled for the inference deployment.

Inference deployments are containerized services that run machine learning models or related components.
Both standalone inferences and those that are part of application deployments can have API keys.

Use this endpoint when you need to:
- Retrieve the API key for authenticating requests to an inference
- Set up client applications to communicate with the inference
- Configure tools or services that need to access the inference

Note: This endpoint returns a 404 error if API key authentication is disabled for the inference.
This endpoint works for both standalone inferences and those that are part of application deployments
from the apps catalog, even though the latter are read-only for other operations.

### Example


```python
import si_tests.clients.gcore.box_api
from si_tests.clients.gcore.box_api.models.v1_inference_api_key_secret_response import V1InferenceAPIKeySecretResponse
from si_tests.clients.gcore.box_api.rest import ApiException
from pprint import pprint

# Defining the host is optional and defaults to http://localhost
# See configuration.py for a list of all supported configuration parameters.
configuration = si_tests.clients.gcore.box_api.Configuration(
    host = "http://localhost"
)


# Enter a context with an instance of the API client
with si_tests.clients.gcore.box_api.ApiClient(configuration) as api_client:
    # Create an instance of the API class
    api_instance = si_tests.clients.gcore.box_api.InferencesApi(api_client)
    project_name = 'project_name_example' # str | Project name
    inference_name = 'inference_name_example' # str | Inference deployment name

    try:
        # Get inference API key
        api_response = api_instance.v1_get_inference_api_key(project_name, inference_name)
        print("The response of InferencesApi->v1_get_inference_api_key:\n")
        pprint(api_response)
    except ApiException as e:
        print("Exception when calling InferencesApi->v1_get_inference_api_key: %s\n" % e)
```
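
A 404 from this endpoint is ambiguous: either API key authentication is disabled or the inference does not exist. One way to tell the two apart is to probe the inference itself with `v1_get_inference`. The sketch below illustrates that idea; `get_api_key_or_reason` is a hypothetical helper, and the `status` attribute on the exception is an assumption about the generated `ApiException` class.

```python
def get_api_key_or_reason(api, project_name, inference_name):
    """Return (response, None) on success, or (None, reason) on a 404.

    Hypothetical helper: `api` is any object exposing v1_get_inference_api_key
    and v1_get_inference with the signatures documented in this file.
    """
    try:
        return api.v1_get_inference_api_key(project_name, inference_name), None
    except Exception as exc:
        # Assumption: the HTTP status code is available as `exc.status`.
        if getattr(exc, "status", None) != 404:
            raise

    # The key lookup returned 404, so check whether the inference exists.
    try:
        api.v1_get_inference(project_name, inference_name)
    except Exception as exc:
        if getattr(exc, "status", None) == 404:
            return None, "inference not found"
        raise
    return None, "API key authentication is disabled"
```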



### Parameters


Name | Type | Description  | Notes
------------- | ------------- | ------------- | -------------
 **project_name** | **str**| Project name | 
 **inference_name** | **str**| Inference deployment name | 

### Return type

[**V1InferenceAPIKeySecretResponse**](V1InferenceAPIKeySecretResponse.md)

### Authorization

No authorization required

### HTTP request headers

 - **Content-Type**: Not defined
 - **Accept**: application/json

### HTTP response details

| Status code | Description | Response headers |
|-------------|-------------|------------------|
| **200** | OK | - |
| **404** | Not Found (API key is disabled or inference doesn't exist) | - |

[[Back to top]](#) [[Back to API list]](../README.md#documentation-for-api-endpoints) [[Back to Model list]](../README.md#documentation-for-models) [[Back to README]](../README.md)

# **v1_get_inference_logs**
> V1InferenceLogsResponse v1_get_inference_logs(project_name, inference_name, region=region, limit=limit, order_by=order_by)

Get inference logs

This endpoint retrieves the logs from the containers running the inference deployment.
You can filter logs by region, limit the number of returned records, and specify the sort order.

Inference deployments are containerized services that run machine learning models or related components.
Both standalone inferences and those that are part of application deployments generate logs.

The logs are useful for:
- Debugging issues with the inference deployment
- Monitoring the performance and behavior of the model
- Troubleshooting errors or unexpected results
- Analyzing usage patterns and request handling

Use this endpoint when you need to:
- Diagnose problems with an inference deployment
- Monitor the activity of an inference
- Collect logs for analysis or reporting

Note: This endpoint works for both standalone inferences and those that are part of application deployments
from the apps catalog, even though the latter are read-only for other operations.

### Example


```python
import si_tests.clients.gcore.box_api
from si_tests.clients.gcore.box_api.models.v1_inference_logs_response import V1InferenceLogsResponse
from si_tests.clients.gcore.box_api.rest import ApiException
from pprint import pprint

# Defining the host is optional and defaults to http://localhost
# See configuration.py for a list of all supported configuration parameters.
configuration = si_tests.clients.gcore.box_api.Configuration(
    host = "http://localhost"
)


# Enter a context with an instance of the API client
with si_tests.clients.gcore.box_api.ApiClient(configuration) as api_client:
    # Create an instance of the API class
    api_instance = si_tests.clients.gcore.box_api.InferencesApi(api_client)
    project_name = 'project_name_example' # str | Project name
    inference_name = 'inference_name_example' # str | Inference deployment name
    region = 'region_example' # str | Filter by region name(s) (optional)
    limit = '100' # str | Limit the number of returned log records (optional)
    order_by = 'time.desc' # str | Sort order of results (time.asc or time.desc) (optional)

    try:
        # Get inference logs
        api_response = api_instance.v1_get_inference_logs(project_name, inference_name, region=region, limit=limit, order_by=order_by)
        print("The response of InferencesApi->v1_get_inference_logs:\n")
        pprint(api_response)
    except ApiException as e:
        print("Exception when calling InferencesApi->v1_get_inference_logs: %s\n" % e)
```
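
Because `region`, `limit`, and `order_by` are all optional strings, it can help to build the keyword arguments in one place and reject an unsupported sort order before the request is sent. The sketch below is one way to do that; `build_log_query` is a hypothetical helper, and treating `limit` as an integer encoded as a string is an assumption based on its description.

```python
VALID_ORDER = ("time.asc", "time.desc")


def build_log_query(region=None, limit=None, order_by=None):
    """Return keyword arguments for v1_get_inference_logs.

    Hypothetical helper: only the filters that were actually set are
    included, so unset ones fall back to the server defaults.
    """
    kwargs = {}
    if region is not None:
        kwargs["region"] = region
    if limit is not None:
        # The parameter type is str; int() rejects non-numeric input early
        # (assumption: the server expects a whole number).
        kwargs["limit"] = str(int(limit))
    if order_by is not None:
        if order_by not in VALID_ORDER:
            raise ValueError(f"order_by must be one of {VALID_ORDER}")
        kwargs["order_by"] = order_by
    return kwargs
```

With an API instance in hand, the result can be unpacked into the call, for example `api_instance.v1_get_inference_logs(project_name, inference_name, **build_log_query(limit=50, order_by="time.desc"))`.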



### Parameters


Name | Type | Description  | Notes
------------- | ------------- | ------------- | -------------
 **project_name** | **str**| Project name | 
 **inference_name** | **str**| Inference deployment name | 
 **region** | **str**| Filter by region name(s) | [optional] 
 **limit** | **str**| Limit the number of returned log records | [optional] 
 **order_by** | **str**| Sort order of results (time.asc or time.desc) | [optional] 

### Return type

[**V1InferenceLogsResponse**](V1InferenceLogsResponse.md)

### Authorization

No authorization required

### HTTP request headers

 - **Content-Type**: Not defined
 - **Accept**: application/json

### HTTP response details

| Status code | Description | Response headers |
|-------------|-------------|------------------|
| **200** | OK | - |
| **404** | Not Found | - |

[[Back to top]](#) [[Back to API list]](../README.md#documentation-for-api-endpoints) [[Back to Model list]](../README.md#documentation-for-models) [[Back to README]](../README.md)

# **v1_list_inferences**
> V1ListInferenceResponse v1_list_inferences(project_name)

List inference deployments

This endpoint provides a summary of all inference deployments in the project, including their names,
configurations, and current status across all regions where they're deployed.

Inference deployments are containerized services that run machine learning models or related components.
They can be created directly or as part of an application deployment from the apps catalog.

Use this endpoint when you need to:
- Get an overview of all inference deployments in a project
- Monitor the status of your inference deployments
- Find specific inference deployments by name or configuration

Note: The list will include both standalone inference deployments and those that are part of
application deployments from the apps catalog. Inferences that are part of app deployments
are marked as read-only and cannot be modified or deleted directly. They must be managed
through the parent application deployment using the `/v1/{project_name}/apps/deployments` endpoints.

### Example


```python
import si_tests.clients.gcore.box_api
from si_tests.clients.gcore.box_api.models.v1_list_inference_response import V1ListInferenceResponse
from si_tests.clients.gcore.box_api.rest import ApiException
from pprint import pprint

# Defining the host is optional and defaults to http://localhost
# See configuration.py for a list of all supported configuration parameters.
configuration = si_tests.clients.gcore.box_api.Configuration(
    host = "http://localhost"
)


# Enter a context with an instance of the API client
with si_tests.clients.gcore.box_api.ApiClient(configuration) as api_client:
    # Create an instance of the API class
    api_instance = si_tests.clients.gcore.box_api.InferencesApi(api_client)
    project_name = 'project_name_example' # str | Project name

    try:
        # List inference deployments
        api_response = api_instance.v1_list_inferences(project_name)
        print("The response of InferencesApi->v1_list_inferences:\n")
        pprint(api_response)
    except ApiException as e:
        print("Exception when calling InferencesApi->v1_list_inferences: %s\n" % e)
```



### Parameters


Name | Type | Description  | Notes
------------- | ------------- | ------------- | -------------
 **project_name** | **str**| Project name | 

### Return type

[**V1ListInferenceResponse**](V1ListInferenceResponse.md)

### Authorization

No authorization required

### HTTP request headers

 - **Content-Type**: Not defined
 - **Accept**: application/json

### HTTP response details

| Status code | Description | Response headers |
|-------------|-------------|------------------|
| **200** | OK | - |
| **404** | Not Found | - |

[[Back to top]](#) [[Back to API list]](../README.md#documentation-for-api-endpoints) [[Back to Model list]](../README.md#documentation-for-models) [[Back to README]](../README.md)

# **v1_update_inference**
> V1InferenceResponse v1_update_inference(project_name, inference_name, v1_update_inference_request, dry_run=dry_run)

Update inference deployment

This endpoint modifies the configuration of an existing standalone inference deployment,
including its container image, resource requirements, scaling options, and networking settings.
You can also update which regions the inference is deployed to.

Inference deployments are containerized services that run machine learning models or related components.
Standalone inferences can be updated directly using this endpoint.

Use this endpoint when you need to:
- Change the container image or version of an inference
- Modify resource allocations (CPU, memory, GPU)
- Update scaling parameters
- Change the regions where an inference is deployed
- Modify environment variables or other configuration

Note: This endpoint can only update standalone inference deployments. Inferences that are part of
application deployments from the apps catalog are read-only and cannot be modified directly.
If you attempt to update a read-only inference, you will receive an error. Such inferences
must be managed through the parent application deployment using the `/v1/{project_name}/apps/deployments` endpoints.

### Example


```python
import si_tests.clients.gcore.box_api
from si_tests.clients.gcore.box_api.models.v1_inference_response import V1InferenceResponse
from si_tests.clients.gcore.box_api.models.v1_update_inference_request import V1UpdateInferenceRequest
from si_tests.clients.gcore.box_api.rest import ApiException
from pprint import pprint

# Defining the host is optional and defaults to http://localhost
# See configuration.py for a list of all supported configuration parameters.
configuration = si_tests.clients.gcore.box_api.Configuration(
    host = "http://localhost"
)


# Enter a context with an instance of the API client
with si_tests.clients.gcore.box_api.ApiClient(configuration) as api_client:
    # Create an instance of the API class
    api_instance = si_tests.clients.gcore.box_api.InferencesApi(api_client)
    project_name = 'project_name_example' # str | Project name
    inference_name = 'inference_name_example' # str | Inference deployment name
    v1_update_inference_request = si_tests.clients.gcore.box_api.V1UpdateInferenceRequest() # V1UpdateInferenceRequest | Updated inference deployment configuration
    dry_run = True # bool | Perform validation but do not apply any changes (optional)

    try:
        # Update inference deployment
        api_response = api_instance.v1_update_inference(project_name, inference_name, v1_update_inference_request, dry_run=dry_run)
        print("The response of InferencesApi->v1_update_inference:\n")
        pprint(api_response)
    except ApiException as e:
        print("Exception when calling InferencesApi->v1_update_inference: %s\n" % e)
```
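
When an update may be rejected (for example because the inference is read-only), a dry run can serve as a pre-flight check that reports the problem without changing anything. The sketch below returns the raised exception instead of propagating it, so a caller can decide what to do next. `check_update` is a hypothetical helper, and it assumes that a rejected dry run surfaces as an exception.

```python
def check_update(api, project_name, inference_name, request):
    """Return None if the dry run is accepted, otherwise the exception raised.

    Hypothetical helper: `api` is any object exposing v1_update_inference
    with the signature documented in this section. Nothing is applied.
    """
    try:
        api.v1_update_inference(
            project_name, inference_name, request, dry_run=True
        )
    except Exception as exc:
        # Hand the error back so the caller can log it or inspect it.
        return exc
    return None
```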



### Parameters


Name | Type | Description  | Notes
------------- | ------------- | ------------- | -------------
 **project_name** | **str**| Project name | 
 **inference_name** | **str**| Inference deployment name | 
 **v1_update_inference_request** | [**V1UpdateInferenceRequest**](V1UpdateInferenceRequest.md)| Updated inference deployment configuration | 
 **dry_run** | **bool**| Perform validation but do not apply any changes | [optional] 

### Return type

[**V1InferenceResponse**](V1InferenceResponse.md)

### Authorization

No authorization required

### HTTP request headers

 - **Content-Type**: application/json
 - **Accept**: application/json

### HTTP response details

| Status code | Description | Response headers |
|-------------|-------------|------------------|
| **200** | OK | - |
| **204** | No Content (when dry_run is true) | - |
| **404** | Not Found | - |

[[Back to top]](#) [[Back to API list]](../README.md#documentation-for-api-endpoints) [[Back to Model list]](../README.md#documentation-for-models) [[Back to README]](../README.md)

