IMOS - Data Lake

This is an additional solution and requires a separate license.

The Veson IMOS Platform (IMOS) Data Lake is a highly scalable cloud solution that delivers a holistic view of a client's data in an easy-to-consume format.

It is intended for Business/Data Analysts, Project Managers, Heads of IT, or anyone responsible for using large volumes of historical data to recommend well-supported courses of action that improve current performance.

For more information on how to use the Data Lake, see Data Lake Best Practices.

How It Works

The Data Lake provides table extracts from a client's operational database in the Veson IMOS Platform, transforms those extracts into Report Designer reporting format, and places the transformed extracts into a secure, encrypted download location. The secure location is accessible only to the client and is protected by an industry-standard authentication mechanism. To ensure optimal download performance, the data is stored in the same geographic region as the client: North America, Europe, or Asia-Pacific.

With the data easily accessible and in a standardized format, clients typically choose to ingest the information into a downstream data warehouse or business intelligence (BI) solution.

Clients may download the content from their secure location on a regular basis. For convenience, seven days' worth of data is retained in the Data Lake at all times. By default, the data is refreshed every 24 hours; upon request, the data can be refreshed as frequently as once per hour.
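The retention and refresh rules above can be sketched as a small planning helper. This is a minimal illustration, not part of the product: the function name `plan_download` and the idea of tracking a "last ingested" timestamp are assumptions, and how a client obtains the list of available extract timestamps is not specified here.

```python
from datetime import datetime, timedelta, timezone

RETENTION = timedelta(days=7)  # retention window stated in the text above


def plan_download(available, last_ingested, now):
    """Decide which extracts still need downloading (hypothetical helper).

    available     -- timestamps of the extracts currently in the Data Lake
    last_ingested -- timestamp of the newest extract already loaded, or None
    now           -- current time
    Returns (pending, gap_exceeds_retention): the extracts newer than
    last_ingested, oldest first, and a flag that is True when the last ingest
    is older than the retention window, so intermediate extracts may be gone.
    """
    gap_exceeds_retention = (
        last_ingested is not None and now - last_ingested > RETENTION
    )
    pending = sorted(
        ts for ts in available if last_ingested is None or ts > last_ingested
    )
    return pending, gap_exceeds_retention


# Example with the default 24-hour refresh: seven daily extracts are retained.
now = datetime(2024, 1, 10, tzinfo=timezone.utc)
available = [now - timedelta(days=d) for d in range(7)]
pending, stale = plan_download(available, now - timedelta(days=2), now)
```

In this example, `pending` holds the two most recent daily extracts and `stale` is `False`. A client that has not downloaded anything for more than seven days would get `stale == True` and could not rebuild the intermediate history from the Data Lake alone.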

The Data Lake is managed by Veson Nautical and is monitored in real time.

Frequently Asked Questions

How is the Data Lake different from replicating the IMOS operational database on-premises?

On-premises database replication lets clients run queries and reports directly against the replicated database to feed BI and data warehouse solutions or other in-house systems. However, replication processes can be labor-intensive and unreliable for large data sets, and the client must modify their reporting and data transformation processes whenever the source Veson IMOS Platform database schema changes.

Data Lake addresses these issues in two ways. First, Data Lake transforms the Veson IMOS Platform database extracts into Report Designer format before making them available to clients, which makes them immune to schema changes. Second, the extracts are delivered to the client's download location on a daily or hourly basis, making database replication unnecessary.

Replication is also traditionally used for business continuity: when one data center is unavailable, clients can continue to access data in a replica. Amazon Web Services (AWS) provides redundancy that removes the likelihood of a single point of failure in any part of the Veson IMOS Platform stack (storage, compute, networking), so traditional on-premises replication for business continuity and disaster recovery is not necessary.

How is the Data Lake different from taking a backup of the IMOS operational database?

Backup and restore operations are typically undertaken in response to the corruption or loss of a server, or to archive the database data at regular intervals for analysis and/or compliance requirements.

Data Lake provides similar functionality by extracting the Veson IMOS Platform database tables on a daily or hourly basis and making them available to clients to import into their reporting, BI, or data warehouse solution. AWS provides the redundancy that removes the likelihood of a single point of failure in any part of the Veson IMOS Platform stack (storage, compute, networking), rendering backup and restore for data recovery unnecessary.

Does Data Lake extract the entire IMOS operational database each time it runs, or just incremental changes?

Each extraction from the Veson IMOS Platform operational database is a full load of the current database tables. After transformation, the extract files are compressed to ~90% of their original size, then delivered to the client’s directory for download.
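Because every extract is a full load rather than a delta, a downstream loader would typically replace the destination table instead of appending to it. The sketch below shows one way to do that atomically, using Python's built-in `sqlite3` module as a stand-in for the client's warehouse. The table name `voyage`, its columns, and the helper `replace_table` are hypothetical and chosen only for illustration.

```python
import sqlite3


def replace_table(conn, table, columns, rows):
    """Replace the full contents of `table` with `rows` in one transaction.

    `table` and `columns` are interpolated into SQL, so they must come from
    trusted configuration, never from user input.
    """
    column_list = ", ".join(columns)
    placeholders = ", ".join("?" for _ in columns)
    with conn:  # commits on success, rolls back if any statement fails
        conn.execute(f"DELETE FROM {table}")
        conn.executemany(
            f"INSERT INTO {table} ({column_list}) VALUES ({placeholders})",
            rows,
        )


# Hypothetical destination table standing in for a warehouse table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE voyage (voyage_id INTEGER PRIMARY KEY, status TEXT)")

# First full load, then a later full load that supersedes it entirely.
replace_table(conn, "voyage", ["voyage_id", "status"], [(1, "open"), (2, "open")])
replace_table(conn, "voyage", ["voyage_id", "status"], [(1, "closed")])
rows = conn.execute("SELECT voyage_id, status FROM voyage").fetchall()
```

After the second load, the table contains only the rows from the latest extract. Wrapping the delete and insert in a single transaction means readers of the destination table never see it half-loaded.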

Does a client need to prepare an ETL (Extract, Transform, Load) in order to create a proper data warehouse?

Yes. Clients will need to transform the data they download from the Reporting Repository, and the target format will depend on the data warehouse or BI solution they are using.

What if the source data changes between pulls?

The extracted files represent the state of the client’s Veson IMOS Platform operational database at the point of extraction. The client is responsible for adapting the data in the destination system to account for changes between extract file versions.

If the client is concerned about data change frequency, they are likely a better fit for the hourly update model.
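One way a client might account for changes between extract file versions is to compare two consecutive full extracts by primary key. The following is a minimal sketch under assumptions: rows are represented as dictionaries, each table has a single key column, and the function `diff_snapshots` and the sample field names are hypothetical.

```python
def diff_snapshots(previous, current, key):
    """Compare two full extracts of one table (hypothetical helper).

    previous, current -- lists of row dictionaries
    key               -- name of the column that uniquely identifies a row
    Returns (inserted, updated, deleted) lists of rows.
    """
    prev_by_key = {row[key]: row for row in previous}
    curr_by_key = {row[key]: row for row in current}

    inserted = [row for k, row in curr_by_key.items() if k not in prev_by_key]
    deleted = [row for k, row in prev_by_key.items() if k not in curr_by_key]
    updated = [
        row
        for k, row in curr_by_key.items()
        if k in prev_by_key and row != prev_by_key[k]
    ]
    return inserted, updated, deleted


# Illustrative data only; real extracts will have different fields.
yesterday = [
    {"voyage_id": 1, "status": "open"},
    {"voyage_id": 2, "status": "open"},
]
today = [
    {"voyage_id": 1, "status": "closed"},
    {"voyage_id": 3, "status": "open"},
]
inserted, updated, deleted = diff_snapshots(yesterday, today, "voyage_id")
```

Here voyage 3 is reported as inserted, voyage 1 as updated, and voyage 2 as deleted. Changes that happen and are reverted between two extracts are invisible to this comparison, which is one reason a client sensitive to change frequency may prefer hourly extracts.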

What is the difference between the existing Data Warehouse Connector and the Data Lake offering?

The existing Data Warehouse Connector license enables a client to connect Report Designer report output to an external BI or analytics system. Technically, it is an API connector, whereas the Data Lake is a set of files extracted from all the information in the client’s Veson IMOS Platform operational database. Therefore, the scope and scale of the Data Lake are vastly larger than those of the Data Warehouse Connector.

Will unique changes/additions made to the Report Designer schema be reflected in Data Lake?

Yes. Data Lake output will match the Report Designer schema, so any additional fields that the client wants to include in the Data Lake output should be added to the Report Designer schema, and those additional fields will be available for inclusion in Report Designer reports as well.

The Data Lake mechanism takes transactional database tables and transforms them into a large .form file that resembles the Report Designer schema. Therefore, if a column is missing in Report Designer, it will also be missing from the final output that goes into the S3 bucket as part of this setup.

Does Veson Nautical host the clients’ data warehouse?

No. We are not currently planning to offer a hosted data warehouse as part of this solution.

Can my organization specify the time of day for the Data Lake extraction?

No. The extraction schedule is global and cannot be modified on a per-client basis.

Does the Data Lake extract represent a complete copy of the IMOS Data Dictionary?

No. The Data Dictionary is another name for our Report Designer schema (also known as the Data Map); it is the complete schema for BI reporting. The Data Lake uses a subset of this schema. Data Lake and Data Dictionary are different by design, since the Report Designer has predefined joins that are not relevant to a BI reporting scenario where you join the data yourself, outside of the system.
