Data sets

Limited access

This feature is only available to the root and super_admin_domain profiles. Ask your administrator to assign you the appropriate role profile.

Data sets

A data set is a group of values collected from remote devices in a chosen organization and projected as a table. You can configure OpenGate to fill each data set column by selecting from the data streams available in your organization. Column values can only be strings, numbers, or booleans. For this reason, if you select a data stream defined by an object or array schema, you must define a path that reaches one of these basic types. Also, if your devices have communication modules, you must specify a separate column for each communication module whose data you want to see.

In this section, you can learn how to administer a data set and read its data.

Provisioning

JSON configuration

Identifier column

When a data set is defined, the identifier column is required. This column represents the provision.administration.identifier._current.value path with filter=YES and sort=true.

Column path field

The column path field consists of three parts. The last part is only required if the selected data stream, represented by its datastreamId field, has an array or object schema:

  • Data stream identifier: You must select the datastreamId you want to see. If the data stream identifier contains communicationModules[], you must also select the index of the communication module that you want to see. Example: device.communicationModules[0].subscription.mobile.imsi._current.value.
  • Field of data streams: This can be one of the properties of any data stream. These options are:
    • _current.value
    • _current.date
    • _current.at
    • _current.source
    • _current.sourceInfo
  • Value path: If the data stream schema is not a basic type, you must select a path that reaches a field with a basic type.
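
The three parts described above can be assembled as in the following minimal sketch. The helper name build_column_path is hypothetical and not part of OpenGate. Joining the optional value path with a dot is also an assumption, since the documented example only shows the first two parts.

```python
from typing import Optional

# Data stream fields listed in the documentation above.
DATASTREAM_FIELDS = {
    "_current.value",
    "_current.date",
    "_current.at",
    "_current.source",
    "_current.sourceInfo",
}


def build_column_path(datastream_id: str, field: str,
                      value_path: Optional[str] = None) -> str:
    """Hypothetical helper: join the parts of a column path with dots."""
    if field not in DATASTREAM_FIELDS:
        raise ValueError(f"unknown data stream field: {field}")
    if "communicationModules[]" in datastream_id:
        # The documentation requires a concrete module index here.
        raise ValueError("select an index, e.g. communicationModules[0]")
    parts = [datastream_id, field]
    if value_path:
        # Assumption: the value path is appended with a dot separator.
        parts.append(value_path)
    return ".".join(parts)


path = build_column_path(
    "device.communicationModules[0].subscription.mobile.imsi",
    "_current.value",
)
print(path)
```

The call reproduces the path shown in the communication module example above.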

Updating datasets

You can edit several features of a data set, but doing so may affect its structure or the data stored in it. In that case, a process must adapt the stored information to the requested changes. Until this process finishes, dirty values will exist.

Fields which can be modified are:

  • Name
  • Description
  • IdentifierColumn
  • Columns (you can add, remove, and modify columns)

Column changes considerations:

  • The name of a column must be unique, so when you add or rename a column, you cannot choose a name that is already in use.
  • You cannot add or remove columns whose ‘filter’ property is ‘ALWAYS’. Likewise, you cannot change the ‘filter’ property of an existing column to ‘ALWAYS’, or change it from ‘ALWAYS’ to another value.
  • You cannot edit the path of a column. Instead, you can remove the column and create it again with the new path, which gives the same result.
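
The column rules above can be checked on the client before sending an update, as in this minimal sketch. It assumes each column is a dictionary with name, path, and filter keys and that columns are matched by name, so a rename appears as a removal plus an addition. The function check_column_changes is hypothetical and does not reflect how OpenGate validates requests internally.

```python
def check_column_changes(current, proposed):
    """Return a list of rule violations; an empty list means none were found."""
    errors = []
    names = [col["name"] for col in proposed]
    if len(names) != len(set(names)):
        errors.append("column names must be unique")

    # Assumption: columns are matched by name between the two definitions.
    old = {col["name"]: col for col in current}
    new = {col["name"]: col for col in proposed}

    for name in sorted(new.keys() - old.keys()):
        if new[name]["filter"] == "ALWAYS":
            errors.append(f"cannot add column {name!r} with filter ALWAYS")
    for name in sorted(old.keys() - new.keys()):
        if old[name]["filter"] == "ALWAYS":
            errors.append(f"cannot remove column {name!r} with filter ALWAYS")
    for name in sorted(old.keys() & new.keys()):
        before, after = old[name], new[name]
        filters = (before["filter"], after["filter"])
        if before["filter"] != after["filter"] and "ALWAYS" in filters:
            errors.append(f"cannot change filter of column {name!r} to or from ALWAYS")
        if before["path"] != after["path"]:
            errors.append(f"cannot edit path of column {name!r}; remove and recreate it")
    return errors


# Made-up columns, used only to exercise the checks.
current = [
    {"name": "imsi", "path": "a._current.value", "filter": "YES"},
    {"name": "temp", "path": "b._current.value", "filter": "ALWAYS"},
]
print(check_column_changes(current, current[:1]))
```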

Reading data

The search resource for data sets differs from the other search resources. You must include the data set’s identifier in the URL of the search request. The request body is very similar to any other search:

  • Filter and sort have the same format, using identifierColumn or columns.name as the key.
  • Limit has the same format, too.
  • Group does not exist in data set searches.
  • Select is different: it is an array that contains only columns.name values.

When JSON is selected as the response format, the data set search response differs significantly from other searches. In this case, the search returns a JSON document with two fields:

  • columns: Array property. If the search uses the select clause, this property contains the selected columns. If it does not, this property contains the identifier column in the first position, followed by the columns.name values defined in the data set, in the same order.
  • data: Array property where each item represents a row of the data set. Each row is itself an array with one value for each column listed in the columns property.
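
Because rows are returned as plain arrays, a client usually has to pair each value with its column. The following minimal sketch does that. It assumes the entries of columns are column names given as strings, and the sample response is made up for illustration.

```python
def rows_as_dicts(response):
    """Pair every row in 'data' with the names listed in 'columns'."""
    columns = response["columns"]
    return [dict(zip(columns, row)) for row in response["data"]]


# Made-up response following the two-field layout described above.
sample_response = {
    "columns": ["identifier", "temperature", "online"],
    "data": [
        ["device-1", 21.5, True],
        ["device-2", None, False],
    ],
}

for record in rows_as_dicts(sample_response):
    print(record)
```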

In addition, you can retrieve this information in CSV format. Keep the following considerations in mind:

  • By default, any text value is enclosed in double quotes. Example: “text value”. You can replace the double quote with any other character.
  • By default, special characters are escaped with the backslash character (‘\’), but you can define another escape character.
  • You can also customize the end-of-line character, which by default is ‘\n’.
  • By default, null values are represented by an empty text. This can also be changed.

WARNING Sorting is disabled when you retrieve data in CSV format. This is an intentional design choice that significantly improves retrieval speed for large data sets, since data downloaded as CSV is typically used for subsequent analysis and manipulation.

All these CSV options can be set using the header parameter designed for that purpose. However, you are responsible for making sure that the resulting CSV response is correctly formatted.
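
On the client side, a CSV response that uses the default options can be read with Python's standard csv module, as in this minimal sketch. The sample text is made up and the parse_csv helper is hypothetical. Since the module cannot tell a quoted empty text from an unquoted one, every empty field is treated as null here.

```python
import csv
import io


def parse_csv(text, quote='"', escape="\\", null_text=""):
    """Parse CSV text and map fields equal to null_text to None."""
    reader = csv.reader(
        io.StringIO(text, newline=""),
        quotechar=quote,
        escapechar=escape,
        doublequote=False,  # quotes are escaped with 'escape', not doubled
    )
    return [
        [None if field == null_text else field for field in row]
        for row in reader
    ]


# Made-up sample: quoted text values, backslash escapes, '\n' line endings.
sample = '"device-1","say \\"hi\\"",\n"device-2","ok",21.5\n'
rows = parse_csv(sample)
print(rows)
```

All non-null fields come back as strings, so numeric columns still need to be converted by the caller.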

Sorting limitations

For performance reasons, a data set can define at most three columns as sortable. In a data search request, you can sort by identifierColumn and by the sortable columns, but each request can use only one or two of them.

For example, if a data set has the columns A, B, C, and D, and its sortable columns are A, B, and C, valid sorting combinations would be (A), (A, B), (B, A), (identifierColumn, A) or (C, identifierColumn). Invalid combinations would be (A, D), (identifierColumn, D), (identifierColumn, A, B) or (A, B, C).
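
The rule can be expressed as a small check, sketched below with the columns from the example. The function is_valid_sort is hypothetical, and the sketch does not cover cases the text leaves open, such as repeating the same column twice.

```python
def is_valid_sort(sort_columns, sortable_columns, identifier="identifierColumn"):
    """Check a sort clause: one or two columns, each sortable or the identifier."""
    allowed = set(sortable_columns) | {identifier}
    return 1 <= len(sort_columns) <= 2 and all(c in allowed for c in sort_columns)


sortable = ["A", "B", "C"]  # column D exists in the data set but is not sortable

print(is_valid_sort(["C", "identifierColumn"], sortable))
print(is_valid_sort(["A", "B", "C"], sortable))
```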

Limit specification

Depending on the response type, the pagination specification and behavior can differ:

  • JSON response: If the limit fields are not specified, default values (specified in configuration) are assigned. If they are specified, they are validated against the configured values.
  • CSV response: If the limit field is not defined in the request body, the complete data set is retrieved as CSV. If the limit field is defined but incomplete, an error is returned indicating that all its fields must be specified correctly.

WARNING Querying for complete data can take too long. Use this option carefully.

Example reading collected data in CSV format

The following request body uses the limit sub-document to download a single page of data.

{
  "filter": {},
  "limit": {
    "size": 500,
    "start": 1
  }
}

In the following case, complete data will be retrieved.

{
  "filter": {}
}

Finally, this is an example of an invalid query for CSV, because the limit field is defined but incomplete:

{
  "filter": {},
  "limit": {}
}
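
The three request bodies above can be told apart with a check like the following minimal sketch. It assumes that a complete limit contains both size and start, as in the first example. The function csv_limit_mode is hypothetical, and the real service may validate more than this.

```python
def csv_limit_mode(body):
    """Classify a CSV search body as 'complete' or 'paged', or raise if invalid."""
    if "limit" not in body:
        return "complete"  # no limit: the whole data set is retrieved
    limit = body["limit"]
    # Assumption: 'size' and 'start' are the fields a complete limit needs.
    missing = [key for key in ("size", "start") if key not in limit]
    if missing:
        raise ValueError(f"incomplete limit, missing: {', '.join(missing)}")
    return "paged"


print(csv_limit_mode({"filter": {}, "limit": {"size": 500, "start": 1}}))
print(csv_limit_mode({"filter": {}}))
```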

API specification