Internal database¶
Warning
Prior to v2.5, Fast2 relied on an Elasticsearch database. This component has been dropped in favor of OpenSearch.
However, the configuration of these two databases is very similar (if not identical).
Every object passing through Fast2 is stored in an internal database. Whether carried by the document or the punnet, all metadata are recorded in the warehouse. The major benefit of this architecture is the ability to check that everything is going well, through counters on the documents/data processed during your migrations.
In addition, operations can easily be rolled back or resumed in case of a server crash. Nothing will be lost, as Fast2 knows precisely where everything stopped.
This is the logic behind real-time backups in ES.
Indexes¶
Each Elasticsearch index referenced by Fast2 is registered with a f2_ prefix. An index name is always written in lower case, even if the campaign name in Fast2 contains upper-case characters.
For example, the campaign MyCampaign_Try10 will be stored in the index f2_mycampaign_try10.
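This naming rule can be illustrated with a quick shell one-liner (using the campaign name from the example above):

```shell
# Derive the index name: prefix with f2_ and lower-case the campaign name
campaign="MyCampaign_Try10"
index="f2_$(printf '%s' "$campaign" | tr '[:upper:]' '[:lower:]')"
echo "$index"   # prints: f2_mycampaign_try10
```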
During broker instantiation at Fast2 startup, some indices are automatically created:

| Index key | Description |
|---|---|
| f2_campaigns | List of existing campaigns with processing dates and status |
| f2_campaigns_sources | Links between campaigns and the workers having performed each campaign |
| f2_queue_settings | Reference information about source and task threads |
| f2_jobs_settings | Gathers the configuration of save jobs |
| f2_jobs_info | Details about past job executions |
For each new campaign of Fast2, an index will be created: if you decide to run a new campaign named EcmInjection, the new index will be f2_ecminjection_try1.
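To see which Fast2 indices currently exist, you can query the database's `_cat` API directly (the host and port below assume a local default installation; adapt them to your setup):

```shell
# List all Fast2 indices with their document counts and sizes
curl "http://localhost:1790/_cat/indices/f2_*?v"
```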
Configuration¶
With or without¶
For an optimal migration setup, this third-party software can easily be configured at different levels to best match your needs. If required, it can even be disabled.
Port¶
By default, Fast2 sends its data via the Elasticsearch API made available on port 1790.
However, if this port is already used by another Fast2 instance or any other process, the port number can be changed in the configuration files.
Elasticsearch has to be reachable from both the Fast2 broker and the Kibana module (if the latter is enabled). If the Elasticsearch port has to be reconfigured to match specific needs, there are exactly three places where this change must be made:
Fast2 < v2.5 (Elasticsearch):

| File | Specification |
|---|---|
| ./config/application.properties | opensearch.port=1790 |
| ./elasticsearch-X.Y.Z/config/elasticsearch.yml | http.port: <es-port> |
| ./kibana-X.Y.Z/config/kibana.yml | elasticsearch.hosts: ["http://localhost:<es-port>"] |
Fast2 ≥ v2.5 (OpenSearch):

| File | Specification |
|---|---|
| ./config/application.properties | opensearch.port=1790 |
| ./opensearch-X.Y.Z/config/opensearch.yml | http.port: <es-port> |
| ./opensearch-dashboards-X.Y.Z/config/opensearch_dashboards.yml | opensearch.hosts: ["http://localhost:<es-port>"] |
If the dashboard component is installed, the database port also needs to be updated there, as the dashboard needs to access the database in order to read the data.
Memory¶
The more documents, the more data. The more data, the more resources Elasticsearch will need to digest, store and process it, and to respond to the broker.
Head out to the ./opensearch-X.Y.Z/config/jvm.options file. The required configurations are the following:
| Configuration | Purpose |
|---|---|
| -Xms8g | Allocates 8GB of RAM to the Elasticsearch JVM heap, directly on startup. |
| -Xmx8g | Specifies the maximum memory which can be used, if available, by the database. |
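In jvm.options, each setting sits on its own line. For example, to pin the heap at 8GB on startup and cap it at the same value:

```
-Xms8g
-Xmx8g
```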
Warning
As specified in the ./opensearch-X.Y.Z/config/jvm.options file, you should always set the min and max JVM heap size to the same value.
See the Official OpenSearch documentation for more information.
For further comprehension of these parameters, check out the official Elasticsearch documentation on the topic, or the official OpenSearch documentation.
Increasing these values will prevent java.lang.OutOfMemoryError from popping up during heavy migration executions.
Remote access to the database¶
The next operations need to happen while the database is shut down. To make sure of that, the jcmd command (which lists the JVMs running on the machine) might be of great help.
Warning
The database port needs to be opened on the Fast2 server, and accessible from your remote machine.
1. Check that the database port is accessible from your machine, by querying it (e.g. with curl) from your workstation.
2. Head out to the database configuration YAML file opensearch.yml, and add the lines required to expose the database on the network.
3. If Fast2 is installed on a Linux server, you may also need to increase the memory usage for your cluster (on the server where the database is running), as stipulated in the official OpenSearch documentation.
4. Save the file, and run the command refreshing your server configuration.
5. Restart the broker (which will induce the boot-up of the database), and check from your machine that the Fast2 database is reachable. The webpage (at the same URL we curl-ed in the first step) should now respond with the database information.
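The steps above can be sketched as follows. This is a minimal example under assumptions: the default port 1790, a single-node cluster, and standard OpenSearch network settings — adapt the host, keys and values to your own topology.

```shell
# 1. From your workstation: check that the database port answers
curl -i "http://<fast2-server>:1790"

# 2. On the Fast2 server: bind OpenSearch to the network interface
#    (appended to ./opensearch-X.Y.Z/config/opensearch.yml; assumed keys)
#       network.host: 0.0.0.0
#       discovery.type: single-node

# 3. On Linux: raise the mmap count limit OpenSearch requires
echo "vm.max_map_count=262144" | sudo tee -a /etc/sysctl.conf

# 4. Reload the kernel parameters
sudo sysctl -p

# 5. Restart the Fast2 broker, then replay the curl of step 1 from your workstation
```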
Troubleshooting¶
Because of the tight communication between the broker and the database, chances are you will sooner or later be reported 500 server errors caused by unsuccessful exchanges between the two.
Server error 500 when starting a campaign¶
After running quite a few campaigns, you might end up unable to start any more of them, due to the shard limit of the Elasticsearch database (for more in-depth details about shards, check out the official Elasticsearch documentation).
The symptom of this limitation is a regular 500 server error toast in the UI, but it is by checking the logs/broker.log file that its true nature is exposed:
17:24:45.017 [http-nio-1789-exec-17] ERROR org.apache.catalina.core.ContainerBase.[Tomcat].[localhost].[/]:175 - Exception while dispatching incoming RPC call com.google.gwt.user.server.rpc.UnexpectedException: Service method 'public abstract com.fast2.model.taskflow.Campaign com.fast2.hmi.gwt.client.service.GWTCampaignManager.startProcessing(
com.fast2.model.taskflow.Campaign,com.fast2.model.taskflow.design.TaskFlowMapRef,boolean)' threw an unexpected exception:
java.lang.RuntimeException: Caught exception Elasticsearch exception [type=validation_exception, reason=Validation Failed: 1: this action would add [2] total shards, but this cluster currently has [1000]/[1000] maximum shards open;]
...
Caused by: org.elasticsearch.ElasticsearchStatusException: Elasticsearch exception [type=validation_exception, reason=Validation Failed: 1: this action would add [2] total shards, but this cluster currently has [1000]/[1000] maximum shards open;]
As mentioned in the Elasticsearch technicalities, Fast2 records data under indices prefixed with f2_. Each index to delete must therefore begin with this prefix.
Although a drastic cleanup with curl -X DELETE -i "http://<elasticsearch-server>:<elasticsearch-port>/f2_*" would resolve the issue, you might be interested in keeping some campaigns or indices. As any curl query allows, exceptions can be added to the deletion operation to prevent them from being removed from the database. The syntax goes as follows:
curl -X DELETE -i "http://<Fast2-server>:<database-port>/f2_*,-f2_campaigns,-f2_campaigns_sources[,-f2_<campaign-name>]"
Let us now study this query:
| Section | Purpose |
|---|---|
| http://server:port/f2_* | Targets all indices starting with the f2_ prefix. This prevents the deletion of unrelated indices from parallel work on the same Elasticsearch instance, such as Kibana reports, charts or data analyses. |
| -f2_campaigns,-f2_campaigns_sources | These 2 indices are needed if you decide to keep any other campaign. The - sign declares them as exceptions to the delete action. |
| [,-f2_<campaign-name>] | Then you can list all the campaigns you are willing to preserve. |
Wildcards are supported; an exception written -f2_mycampaign* will therefore protect all the campaigns with the myCampaign radical (e.g. myCampaign_Try1, myCampaign_Try2...).
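For instance, to wipe everything except the bookkeeping indices and every campaign starting with myCampaign (the host, port and campaign name below are placeholders to adapt):

```shell
# Delete all f2_ indices, sparing the two bookkeeping indices
# and any index matching the f2_mycampaign* pattern
curl -X DELETE -i "http://localhost:1790/f2_*,-f2_campaigns,-f2_campaigns_sources,-f2_mycampaign*"
```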
Impossible to get results from the Explorer¶
Documents concerned by a migration can quickly add up, especially on cumulative campaigns. When heading to the Explorer to check punnet results, you may end up with too much information to retrieve from the database. An "error 500" message will then pop up, and if you head out to the broker.log, the following stack will be found:
Suppressed: org.elasticsearch.client.ResponseException: method [POST], host [http://localhost:1790], URI [/f2_campaign_try1/_search?pre_filter_shard_size=...], status line [HTTP/1.1 503 Service Unavailable]
{"error":{"root_cause":[{"type":"too_many_buckets_exception","reason":"Trying to create too many buckets. Must be less than or equal to: [10000] but was [10001]. This limit can be set by changing the [search.max_buckets] cluster level setting.","max_buckets":10000}],"type":"search_phase_execution_exception","reason":"all shards failed","phase":"query","grouped":true,"failed_shards":[{"shard":0,"index":"f2_stg-100k_try1","node":"4XM6VD2wTfeOSpr2brIs7g","reason":{"type":"too_many_buckets_exception","reason":"Trying to create too many buckets. Must be less than or equal to: [10000] but was [10001]. This limit can be set by changing the [search.max_buckets] cluster level setting.","max_buckets":10000}}]},"status":503}
To fix this, you need to stop Fast2 (and the database), and raise the search.max_buckets limit mentioned in the error message:
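The stack trace points at the search.max_buckets cluster-level setting. A hedged example, assuming raising the limit is acceptable for your cluster: add the setting to ./opensearch-X.Y.Z/config/opensearch.yml (the value 20000 is an arbitrary example, not a recommendation):

```
search.max_buckets: 20000
```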
Let's quickly wrap up, here¶
This integrated database guarantees data persistence for Fast2. If required, an Elasticsearch database can be shared among several Fast2 servers.
Allocated resources should be increased in order to withstand the load of production environments.