Source
A Source in Open Connector represents a digital endpoint that can be used for data exchange. Sources can function both as data providers and destinations, enabling bidirectional data flow. The actual data transfer process is managed through Synchronizations.
Types of Sources
Open Connector supports three main types of sources:
-
Nextcloud Files
- Internal file system integration
- Direct access to files within your Nextcloud environment
-
External APIs
- REST/HTTP-based services
- Third-party application interfaces
- Cloud services
- External files like CSV, JSON, XML, etc.
-
External Databases (Deprecated)
⚠️ Note: Database connections are deprecated. For database interactions, we recommend using Open Registers as a more robust solution.
Core Concepts
Purpose
A Source defines:
- The location of the data endpoint
- Authentication methods
- Connection parameters
- Basic interaction rules
Authentication
Sources can use various authentication methods:
- API Keys
- OAuth2
- Basic Authentication
- Custom Headers
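As a rough illustration of what these methods amount to on the wire, the sketch below builds the HTTP headers for each one. The function name, the method labels and the `X-API-Key` default are assumptions made for this example, not part of Open Connector; in practice the resulting headers would go into the `headers` option of the source configuration.

```python
import base64


def build_auth_headers(method: str, credentials: dict) -> dict:
    """Return HTTP headers for one of the listed authentication methods.

    Hypothetical helper: the method labels and credential keys are illustrative.
    """
    if method == "api_key":
        # The header name differs per API; "X-API-Key" is only a common choice.
        header = credentials.get("header", "X-API-Key")
        return {header: credentials["key"]}
    if method == "oauth2":
        # An OAuth2 access token is normally sent as a bearer token.
        return {"Authorization": f"Bearer {credentials['token']}"}
    if method == "basic":
        # Basic authentication is base64("username:password").
        raw = f"{credentials['username']}:{credentials['password']}".encode()
        return {"Authorization": "Basic " + base64.b64encode(raw).decode("ascii")}
    if method == "custom_headers":
        # Custom headers are passed through unchanged.
        return dict(credentials)
    raise ValueError(f"Unknown authentication method: {method}")
```

For example, `build_auth_headers("oauth2", {"token": "your-token"})` produces the same `Authorization` header as the example configuration further down.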
Process
Calls to a source are made through the Open Connector call service, which is essentially a wrapper around the Guzzle HTTP client. It adds features such as logging, caching, error handling, and support for pagination and XML files. The call service is a technical service that takes a source object as input (and optionally an endpoint, method and configuration, if the defaults need to be overridden) and returns an array of objects. Synchronizations use the call service to get data from the source.
Configuration
Sources have the following configurable properties:
Property | Type | Description | Default |
---|---|---|---|
uuid | string | Unique identifier for the source | null |
name | string | Display name of the source | null |
description | string | Detailed description of the source | null |
reference | string | External reference identifier | null |
version | string | Version number of the source | '0.0.0' |
location | string | URL or path to the source | null |
isEnabled | boolean | Whether the source is active | null |
type | string | Type of source (api, file or database) | null |
loggingConfig | array | Logging configuration | null |
configuration | array | General configuration | null |
rateLimitLimit | integer | Total allowed requests per period | null |
rateLimitRemaining | integer | Remaining allowed requests | null |
rateLimitReset | integer | Unix timestamp for limit reset | null |
rateLimitWindow | integer | Seconds between requests | null |
lastCall | datetime | Timestamp of last API call | null |
lastSync | datetime | Timestamp of last sync | null |
dateCreated | datetime | Creation timestamp | null |
dateModified | datetime | Last modified timestamp | null |
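One way the rate limit properties could be used is sketched below: before a call, check whether any requests remain in the current period and, if not, wait until the reset timestamp. This is an assumed interpretation of `rateLimitRemaining` and `rateLimitReset`, and the helper name is hypothetical; Open Connector's actual behaviour may differ.

```python
import time
from typing import Optional


def seconds_until_allowed(source: dict, now: Optional[float] = None) -> float:
    """Return how long to wait before calling the source (0.0 means call now)."""
    now = time.time() if now is None else now
    remaining = source.get("rateLimitRemaining")
    reset = source.get("rateLimitReset")
    # Null values mean no limit is known, so nothing blocks the call.
    if remaining is None or reset is None:
        return 0.0
    # Requests are left in this period, or the period has already ended.
    if remaining > 0 or reset <= now:
        return 0.0
    return float(reset - now)
```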
Sources are configured using Guzzle HTTP client options, which control how Open Connector interacts with the source. The configuration is passed as an array to the Guzzle client, with one additional option: the HTTP method of the call can be set through the method property (defaults to GET). The base_uri is overwritten by the location property of the source.
Example Source Configurations
{
"base_uri": "location-from-source",
"method": "GET",
"headers": {
"Authorization": "Bearer your-token",
"Accept": "application/json"
},
"timeout": 30
}
For detailed configuration options, refer to the Guzzle Documentation.
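The rules described in this section (method taken from the configuration with GET as the default, base_uri replaced by the source's location, per-call overrides applied on top) can be sketched as follows. This is a language-neutral illustration in Python rather than the actual PHP implementation, and `build_request` is a hypothetical name.

```python
from typing import Optional


def build_request(source: dict, overrides: Optional[dict] = None) -> tuple:
    """Return (method, options) for one call to a source."""
    options = dict(source.get("configuration") or {})
    # Per-call overrides replace the defaults stored on the source.
    options.update(overrides or {})
    # "method" is the one extra option; it is removed from the client options.
    method = str(options.pop("method", "GET")).upper()
    # The source's location always wins over any configured base_uri.
    options["base_uri"] = source["location"]
    return method, options
```

Calling it with a source whose location is `https://api.example.com` and a configuration containing only `timeout` would yield the method `GET` and options holding that `timeout` plus the location as `base_uri`.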
Dealing with pagination
Open Connector provides robust support for handling paginated API responses. The system can automatically detect and handle various pagination patterns commonly used in APIs.
Pagination Configuration
Pagination can be configured in the source configuration using the following structure:
{
"pagination": {
"type": "offset", // Type of pagination (offset, page, cursor)
"paginationQuery": "page", // Query parameter for page
"limitQuery": "limit", // Query parameter for items per page
"page": 1, // Starting page/offset (defaults to 1)
"limit": 100, // Items per page (defaults to 100)
"totalPages": "meta.pages", // Path to total pages in response
"totalItems": "meta.total", // Path to total items in response
"cursor": "meta.next_cursor", // Path to next cursor (for cursor pagination)
"maxPages": 1000 // Maximum number of pages to fetch (defaults to 1000)
}
}
None of these settings are required; when they are omitted, the call service tries to detect the pagination type automatically.
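To make the settings concrete, the sketch below turns a pagination configuration into the query parameters for the next request, using simple page counting. The defaults for `page`, `limit` and `maxPages` come from the comments above; the default parameter names `page` and `limit` are assumptions taken from the example values, and `page_query` is a hypothetical helper. Offset and cursor handling are left out.

```python
PAGINATION_DEFAULTS = {
    "paginationQuery": "page",  # assumed default, taken from the example
    "limitQuery": "limit",      # assumed default, taken from the example
    "page": 1,
    "limit": 100,
    "maxPages": 1000,
}


def page_query(pagination: dict, pages_fetched: int) -> dict:
    """Return the query parameters for the request after `pages_fetched` pages."""
    settings = {**PAGINATION_DEFAULTS, **pagination}
    if pages_fetched >= settings["maxPages"]:
        raise ValueError("maxPages limit reached")
    # Page counting: start at the configured page and move one page per request.
    position = settings["page"] + pages_fetched
    return {
        settings["paginationQuery"]: position,
        settings["limitQuery"]: settings["limit"],
    }
```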
Supported Pagination Types
- Offset
- Page
- Cursor
Response Handling
APIs typically return data in a wrapped format. The Call Service needs to know where to find the actual result objects. Here are common patterns:
-
Detecting the response wrapper structure
- Looks for common result properties in order: results, items, result, data
- Throws an error if no valid wrapper property is found
- Can be configured to handle unwrapped responses via the _root setting
-
After the call service has fetched the data, it returns an array of the objects on the current page. It then tries to determine whether the result is paginated. It assumes that a paginated response always starts on page 1 and then checks whether:
- The next cursor (from next, next_page, next_cursor, nextPage, nextCursor) is present and is a URL. If so, it follows the URL and fetches the next page.
- The total number of results (from total, totalResults, total_results, total_items, totalItems, total_data, totalData) is greater than the current number of results.
- The next cursor (from next, next_page, next_cursor, nextPage, nextCursor) is higher than the current page counter (if it is an integer).
- The current page is less than the total number of pages (from totalPages, total_pages, pages).
If the call service detects that the result is paginated, it fetches the next page and returns the results. It also updates the source with the new page number and total number of pages.
It continues this loop until it has fetched all pages and returned all results, the configuration.pagination.maxPages limit has been reached (defaults to 1000), or the source has stopped responding.
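The four checks above can be sketched as a single predicate. This is a minimal illustration that assumes the properties sit at the top level of the response body, which the text does not specify; `first_present` and `has_next_page` are hypothetical names.

```python
NEXT_KEYS = ("next", "next_page", "next_cursor", "nextPage", "nextCursor")
TOTAL_KEYS = ("total", "totalResults", "total_results", "total_items",
              "totalItems", "total_data", "totalData")
PAGES_KEYS = ("totalPages", "total_pages", "pages")


def first_present(response: dict, keys: tuple):
    """Return the first non-null value found under any of the keys, else None."""
    for key in keys:
        if response.get(key) is not None:
            return response[key]
    return None


def is_int(value) -> bool:
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(value, int) and not isinstance(value, bool)


def has_next_page(response: dict, current_page: int, fetched_count: int) -> bool:
    """Apply the checks in the order listed above."""
    nxt = first_present(response, NEXT_KEYS)
    if isinstance(nxt, str) and nxt.startswith(("http://", "https://")):
        return True  # next cursor is a URL that can be followed
    total = first_present(response, TOTAL_KEYS)
    if is_int(total) and total > fetched_count:
        return True  # more results exist than have been fetched
    if is_int(nxt) and nxt > current_page:
        return True  # integer cursor points past the current page
    pages = first_present(response, PAGES_KEYS)
    return is_int(pages) and current_page < pages
```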
Results property
{
"results": [
// ... items ...
]
}
Items property
{
"items": [
// ... items ...
]
}
Result property
{
"result": [
// ... items ...
]
}
Data property
{
"data": [
// ... items ...
]
}
No property (root data)
Note: This format only works when 'configuration.results' is set to 'root'
{
// ... items ...
}
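The wrapper lookup described in this section can be sketched as below: try the known properties in order, fall back to the whole body when the source is configured for unwrapped responses, and raise an error otherwise. The function name and the way the root setting is passed in are assumptions; the value `_root` follows the setting named in the list above.

```python
WRAPPER_KEYS = ("results", "items", "result", "data")


def extract_results(response, results_setting=None):
    """Return the result objects from a decoded response body."""
    if results_setting == "_root":
        # Unwrapped response: the body itself is the result set.
        return response if isinstance(response, list) else [response]
    if isinstance(response, dict):
        # The first matching wrapper property wins, in the documented order.
        for key in WRAPPER_KEYS:
            if key in response:
                return response[key]
    raise ValueError("No results, items, result or data property in response")
```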
Pagination Best Practices
-
Performance
- Set appropriate page sizes
- Use cursor-based pagination when available
- Consider implementing caching for paginated results
-
Error Handling
- Implement proper retry logic between pages
- Handle incomplete page fetches gracefully
- Log pagination-related errors separately
-
Rate Limiting
- Account for rate limits across paginated requests
- Implement appropriate delays between requests
- Monitor API quotas during pagination
-
Memory Management
- Consider streaming for large datasets
- Implement batch processing when needed
- Monitor memory usage during pagination
Best Practices
-
Security
- Always use environment variables for sensitive credentials
- Implement proper error handling for connection failures
- Use appropriate timeout values
-
Performance
- Configure appropriate cache settings
- Use pagination when dealing with large datasets
- Set reasonable timeout values
-
Maintenance
- Regularly validate source connections
- Monitor API rate limits
- Keep authentication credentials up to date
Related Concepts
- Synchronizations: Define how data flows between sources
- Transformations: Specify how data should be modified during transfer
- Mappings: Define relationships between source and destination data structures
Automatic Pagination Detection
Open Connector can automatically detect pagination in API responses through multiple methods:
-
Configuration-based Detection
- Checks configured paths for pagination information
- Uses specified metadata locations
- Follows defined pagination patterns
-
Pattern-based Detection
Common patterns automatically detected:
{
"_links": {
"next": "https://api.example.com/items?page=2",
"prev": "https://api.example.com/items?page=1"
}
}
{
"next_page_url": "https://api.example.com/items?page=2",
"prev_page_url": "https://api.example.com/items?page=1"
}
{
"pagination": {
"more_items": true,
"next_cursor": "dXNlcjpXMDdRQ1JQQTQ="
}
}
-
Count-based Detection
The system analyzes:
- Result count versus specified limits
- Total items versus received items
- Collection size patterns
Example:
{
"meta": {
"total": 150,
"per_page": 50
},
"data": [ /* 50 items */ ]
}
Here, the system detects pagination because:
- Total (150) > received items (50)
- Received items matches per_page limit
- Collection size suggests more pages
-
Response Structure Analysis
- Detects standard REST pagination patterns
- Identifies cursor-based pagination markers
- Recognizes offset/limit patterns
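Pattern-based detection can be illustrated with a small lookup over dotted paths, covering the three response shapes shown above. This is only a sketch: the list of paths is limited to those examples, and `get_path` and `find_next` are hypothetical helpers rather than Open Connector functions.

```python
def get_path(data, path: str):
    """Follow a dotted path such as "meta.total"; return None if it is missing."""
    current = data
    for part in path.split("."):
        if not isinstance(current, dict) or part not in current:
            return None
        current = current[part]
    return current


# Only the shapes from the examples above; a real detector would know more.
NEXT_PATHS = ("_links.next", "next_page_url", "pagination.next_cursor")


def find_next(response: dict):
    """Return (path, value) for the first next-page marker found, else None."""
    for path in NEXT_PATHS:
        value = get_path(response, path)
        if value:
            return path, value
    return None
```

The same dotted-path lookup would also resolve configured locations such as `meta.total` or `meta.next_cursor` from the pagination configuration.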
Automatic Handling
When pagination is detected, Open Connector:
- Determines the appropriate pagination type
- Automatically fetches subsequent pages
- Handles rate limiting between requests
- Merges results maintaining order
- Removes potential duplicates
- Provides progress updates
- Manages memory efficiently
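Merging pages while keeping order and dropping duplicates might look like the sketch below. How Open Connector identifies a duplicate is not stated here, so matching on an `id` property is an assumption, as is the `merge_pages` name.

```python
def merge_pages(pages: list, key: str = "id") -> list:
    """Concatenate pages in order, skipping items whose key was already seen."""
    seen = set()
    merged = []
    for page in pages:
        for item in page:
            identity = item.get(key)
            if identity is not None:
                if identity in seen:
                    continue  # duplicate of an item from an earlier page
                seen.add(identity)
            # Items without the key cannot be compared, so they are always kept.
            merged.append(item)
    return merged
```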
Error Handling During Pagination
The system implements robust error handling:
- Retries failed page requests
- Maintains partial results on failures
- Provides detailed error reporting
- Implements backoff strategies
- Preserves successfully retrieved data
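The combination of retries, backoff and preserved partial results can be sketched as a fetch loop like the one below. The retry count, the exponential delay and the stop-on-empty-page rule are illustrative assumptions, and `fetch_page` stands in for whatever performs the actual request.

```python
import time


def fetch_all(fetch_page, max_pages=1000, retries=3, base_delay=1.0, sleep=time.sleep):
    """Collect pages until one is empty; return (results, error).

    If a page keeps failing, the items fetched so far are returned together
    with the last error instead of being discarded.
    """
    results = []
    for page in range(1, max_pages + 1):
        error = None
        for attempt in range(retries + 1):
            try:
                items = fetch_page(page)
                break
            except Exception as exc:  # a real client would catch narrower errors
                error = exc
                if attempt < retries:
                    sleep(base_delay * 2 ** attempt)  # exponential backoff
        else:
            # Every attempt failed: stop and hand back the partial result.
            return results, error
        if not items:
            break
        results.extend(items)
    return results, None
```

Passing `sleep` as a parameter keeps the sketch testable without real delays.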