Skip to main content

Datalake

The epilot Datalake streams real-time events for entity operations, workflow executions, journey analytics, and portal analytics into a ClickHouse data warehouse. Connect any BI tool to build reports and dashboards.

info

Contact customer support or sales to enable the Datalake for your organization.

Data Schemas​

Four datasets are available:

1. Entity Operations Event Streams​

Real-time event stream of entity create, update, and delete operations.

{
"activity_id": "string", // ID for the individual entity operation/activity
"entity_id": "string", // ID of the entity being mutated
"org_id": "string", // Organization ID
"operation": "string", // Operation can be "createEntity," "updateEntity," or "deleteEntity"
"schema": "string", // Schema of the entity
"timestamp": "DateTime", // Timestamp of the entity mutation
"detail": "string", // Stringified JSON payload containing entity data
}
  • operation -- createEntity, updateEntity, or deleteEntity
  • timestamp -- when the operation occurred (useful for time-series reports)
  • activity_id / entity_id -- identifiers for the operation and affected entity
  • detail -- stringified JSON of the full entity state at the time of the operation
tip

The view {org_id}_current_entities_final shows only the latest state of active (non-deleted) entities, simplifying many common queries.

2. Workflow Execution Snapshots​

Real-time snapshots of workflow execution data across five tables:

  • {org_id}_workflow_executions_final -- workflow execution attributes.
  • {org_id}_workflow_phases_final -- phase attributes for each execution.
  • {org_id}_workflow_tasks_final -- individual step/task attributes.
  • {org_id}_workflow_cancellations_final -- cancellation reasons.
  • {org_id}_workflow_contexts_final -- related entities for each execution.

See the Workflow Execution API reference for full details.

3. Journey Analytics​

Tracks user sessions and interactions for each journey.

Journey sessions schema:

{
"id": "string", // Unique identifier for the session (UUID)
"org_id": "string", // Unique identifier for the organization associated with the session (UUID)
"journey_id": "string", // Unique identifier for the user journey to which the session belongs (UUID)
"start_time": "datetime", // Timestamp indicating when the session started
"details": "string", // Additional details about the session, in JSON format
"end_time": "datetime", // Timestamp indicating when the session ended
"last_updated_at": "datetime", // Timestamp indicating when was the session last updated at
}

The details field is a JSON string containing browser session metadata: deviceType, osType, browserTypeAndVersion, screenResolution, viewportSize, colorDepth, languagePreference, ip, embeddedIn, and isLauncherJourney.

Journey events schema:

{
"id": "string", // Unique identifier for the event (UUID)
"org_id": "string", // Identifier of the organization associated with the event
"session_id": "string", // Unique identifier for the session in which the event occurred (UUID)
"journey_id": "string", // Unique identifier for the user journey to which the event belongs (UUID)
"type": "string", // Type of event (e.g., "step_navigation", "journey_submit", "journey_exit")
"details": "string", // Additional details about the event, typically in JSON format
"created_at": "datetime" // Timestamp indicating when the event was recorded
}

Event types (type field):

TypeDescription
journey_load_timeTime for the journey to load (API response time).
step_navigationUser navigated between steps.
journey_submitUser completed and submitted the journey.
journey_exitUser closed or refreshed the journey tab.

The details field carries event-specific data: fromStepName/fromStepNumber/toStepName/toStepNumber for navigation events, stepNumber/stepName for exit events, and load duration for load-time events.

Tables:

  • {org_id}_journey_sessions_final -- journey session data.
  • {org_id}_journey_events_final -- journey event data.

Journey analytics db schema

4. Portal Analytics​

Tracks user sessions and interactions for the customer portal and installer portal.

Portal sessions schema:

{
"id": "string", // Unique identifier for the session (UUID)
"org_id": "string", // Identifier of the organization associated with the event
"app_name": "string", // Name of the app for which the session was created (customer portal, installer portal etc.)
"start_time": "datetime", // Timestamp indicating when the session started
"details": "string", // Additional details about the session, in JSON format
"end_time": "datetime", // Timestamp indicating when the session ended
"last_updated_at": "datetime", // Timestamp indicating when was the session last updated at
}

The details field is a JSON string containing: deviceType, osType, browserTypeAndVersion, screenResolution, viewportSize, colorDepth, languagePreference, ip, email, domainName, and referrer.

Portal events schema:

{
"id": "string", // Unique identifier for the event (UUID)
"org_id": "string", // Identifier of the organization associated with the event
"session_id": "string", // Unique identifier for the session in which the event occurred (UUID)
"app_name": "string", // Name of the app for which the event was created (customer portal, installer portal etc.)
"type": "string", // Type of event (e.g., "user_logged_in", "page_navigation", etc.)
"details": "string", // Additional details about the event, in JSON format
"created_at": "datetime" // Timestamp indicating when the event was recorded
}
Portal event types
TypeDescription
user_logged_inUser logged in.
user_registeredNew user registered.
user_logged_outUser logged out.
user_account_deletedUser account deleted.
user_email_changedUser updated email address.
password_changedUser changed password.
page_navigationUser navigated between pages.
add_contract_initiatedUser started adding a contract.
contract_addedContract added.
contract_viewedUser viewed a contract.
contracts_listing_viewedUser viewed contract list.
contract_due_date_changedUser changed contract due date.
contract_payment_rate_changedUser changed payment rate.
order_viewedUser viewed an order.
orders_listing_viewedUser viewed order list.
order_acceptedUser accepted an order.
order_refusedUser declined an order.
opportunity_viewedUser viewed an opportunity.
opportunities_listing_viewedUser viewed opportunity list.
document_uploadedUser uploaded a document.
document_downloadedUser downloaded a document.
document_deletedUser deleted a document.
documents_listing_viewedUser viewed document list.
all_documents_downloadedUser downloaded all documents in a section.
meter_viewedUser viewed a meter.
meters_listing_viewedUser viewed meter list.
meter_widget_viewedUser viewed a meter widget.
meter_reading_submission_initiatedUser started submitting a meter reading.
meter_reading_submittedMeter reading submitted.
journey_openedUser opened a journey.
journey_closedUser closed a journey.
additional_info_update_initiatedUser started updating custom info.
additional_info_updatedCustom info updated.
additional_info_viewedUser viewed custom info.
payments_listing_viewedUser viewed payment list.
product_viewedUser viewed a product.
teaser_openedUser opened a teaser.
external_website_openedUser navigated to an external site.

The details field carries event-specific context: page URLs for navigation, entity details for contracts/orders/meters, external links, and custom information updates.

Tables:

  • {org_id}_portal_sessions_final -- portal session data.
  • {org_id}_portal_events_final -- portal event data.

Portal analytics db schema

Setting Up the Datalake​

Generate credentials​

  1. Go to Settings > Data Lake in the epilot portal. Datalake menu

  2. The Data Lake page lists active credentials. You can revoke existing credentials or create new ones. Datalake page

  3. Click Generate Credentials.

  4. Save the password immediately -- it is shown only once. Datalake Credentials

  5. Use the username, host, port, database, and password to connect from any BI tool or SQL client.

Querying the Data​

Connect to ClickHouse and query your data using either approach:

Direct SQL queries​

Write SQL queries against the ClickHouse tables.

Example 1: Opportunities created over time

SELECT
count(*) AS count,
JSONExtractString(detail, 'payload', 'source', 'title') AS journey_source,
toStartOfYear(timestamp) AS year,
toStartOfMonth(timestamp) AS month
FROM entity_operations
WHERE schema = 'opportunity'
AND operation = 'createEntity'
GROUP BY journey_source, year, month;

Datalake page

Example 2: Journey sessions over time

SELECT journey_id, start_time, details
FROM {org_id}_journey_sessions_final
WHERE journey_id = {your_journey_id}
AND start_time > '2024-08-01 00:00:00';

Datalake page

Example 3: Portal sessions over time

SELECT app_name, start_time, details
FROM {org_id}_portal_sessions_final
WHERE app_name = {your_app_name}
AND start_time > '2024-08-01 00:00:00';

Datalake page

Use any SQL client that supports ClickHouse. See the DataGrip integration guide for a walkthrough.

Connecting to BI tools​

Connect your preferred BI tool directly to ClickHouse, or use the official ClickHouse ODBC driver.

info

The walkthrough below demonstrates connecting Power BI for an energy utility company reporting on Wallbox and Energieausweis sales journeys.

See ClickHouse + Power BI integration for connection options.

Steps to create a Power BI report:

  1. Add data source -- Install the ODBC driver, then add a new ODBC data source in Power BI. Power BI ODBC

  2. Enter connection details -- Provide the ODBC connection string, username, password, and database.

    Driver={ClickHouse ODBC Driver (ANSI)};Server={host};Port=8443;Database=datawarehouse;

    Power BI Connection

  3. Select data -- Browse schemas and tables. Look for entity_operations and the {org_id}_current_entities_final view. Power BI Entity Operations

  4. Load or transform -- Click Load Data for raw import, or Transform to preprocess data in Power BI's query editor. Power BI Current Entities

  5. Model data -- Create relationships between tables (e.g. entity operations and workflow data). Power BI Data Model

  6. Build reports -- Use charts, tables, and visuals to present insights. Power BI Report Power BI Report

Interactive Power BI Report: Entity Operations​

Interactive Power BI Report: Journey Analytics​

See ClickHouse data visualization integrations for connecting other BI tools.