
Merge pull request #19 from helios-platform/fix/case-study-image-styling
fix: case study image styling
Kuanchiliao1 authored Aug 21, 2024
2 parents 2583417 + a220f2b commit a46af28
Showing 12 changed files with 34 additions and 25 deletions.
9 changes: 6 additions & 3 deletions docs/.vitepress/config.js
@@ -4,13 +4,16 @@ export default defineConfig({
title: "Helios Case Study",
description: "case study - Helios data stream analysis platform",
cleanUrls: true,
base: '/',
head: [['link', { rel: 'icon', href: '/helios_favicon.ico' }]],
base: "/",
head: [
["link", { rel: "icon", href: "/helios_favicon.ico" }],
["link", { rel: "style", href: "/styles/custom.css" }],
],
appearance: false,
themeConfig: {
siteTitle: "Home",
search: {
provider: 'local'
provider: "local",
},
sidebar: {
"/": [
4 changes: 0 additions & 4 deletions docs/.vitepress/theme/styles/custom.css
@@ -22,10 +22,6 @@
  color: #213547;
}

-.outline-link {
-  white-space: wrap !important;
-}
-
.icon-list {
  padding-left: 0.5em;
}
5 changes: 3 additions & 2 deletions docs/building-helios.md
@@ -64,7 +64,9 @@ Of the criteria listed above, ClickHouse's impressive read and write latency par

### Single Node vs Node Cluster

-![Node Cluster](/case_study/node_cluster_opt.png)
+<p style="
+  margin: 0;
+"><img src="/case_study/node_cluster_opt.png" alt="Node Cluster"></p>

We explored several options when determining the optimal deployment strategy for the Helios production ClickHouse server. While many database deployments utilize clustered architectures for high availability and scalability, with modern implementations often leveraging containerization and orchestration tools like Kubernetes, we found this approach less suitable for ClickHouse.

@@ -96,4 +98,3 @@ This prep time setting up the execution environment, a cold start, is not charge
![Cold Starts](/case_study/lambdacoldstarts.png)

Ultimately, we decided to stick with the default setup to save users money and have our Lambda function run with cold starts versus implementing a warm Lambda. We believe the latency impact from this initial setup is of minimal concern given the nature of event streams; after the first execution, each Lambda execution environment stays active as long as it is continually invoked.
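For context, the warm-Lambda approach we decided against usually keeps an execution environment alive by pinging the function on a schedule. A minimal sketch with boto3, assuming a hypothetical rule name and function ARN (a real setup would also need an invoke permission on the function):

```python
import boto3

events = boto3.client("events")

# Ping the connector every 5 minutes so an execution environment stays
# provisioned, trading idle invocations for fewer cold starts.
events.put_rule(
    Name="keep-helios-connector-warm",  # hypothetical rule name
    ScheduleExpression="rate(5 minutes)",
)
events.put_targets(
    Rule="keep-helios-connector-warm",
    Targets=[{
        "Id": "warmer",
        # Placeholder ARN for the Lambda connector.
        "Arn": "arn:aws:lambda:us-east-1:123456789012:function:helios-connector",
    }],
)
```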

9 changes: 6 additions & 3 deletions docs/improving-core-platform.md
@@ -8,17 +8,18 @@ Below we detail the problems we encountered and the solutions we implemented to

The initial version of Helios lacked error handling for failed database insertions of event records within the Lambda connector, potentially leading to data loss and difficult-to-parse error messages. To mitigate these issues and enhance system reliability, we implemented a comprehensive error handling and data quarantine system. Below, we outline the key features of this new system:

<div class="icon-list">
<div class="icon-list" style="margin-top: 8px;">
<p><Icon name="ExclamationCircleIcon"/><span><strong>Error Identification</strong>: The Lambda now includes logic to identify various types of errors, including schema mismatches and insertion failures.</span></p>
<p><Icon name="TagIcon"/><span><strong>Error Categorization</strong>: Each error is then categorized by type and summarized with a concise abstract, providing clear insight into the nature of data quality issues.</span></p>
<p><Icon name="ArchiveBoxIcon"/><span><strong>Data Preservation</strong>: The data that fails to insert into the main table is then stored in a separate quarantine table along with the error summary details. This ensures no data loss and allows users to quickly examine the exact records that failed to insert.</span></p>
<p><Icon name="SparklesIcon"/><span><strong>AI Summary</strong>: For users who provide a ChatGPT AI key during deployment, we've integrated an AI-powered feature to enhance the error analysis process. This feature leverages a custom ChatGPT system prompt to summarize and interpret the errors stored in the quarantine table, significantly aiding users in their debugging efforts.</span></p>
</div>

<br>
<video class="video" width="700" height="400" muted autoplay loop style="border-radius: 5px; box-shadow: 10px 10px 20px rgba(0, 0, 0, 0.5);">
<source src="/case_study/quartable.mp4" type="video/mp4">
</video>

<br>

These enhancements collectively improve Helios’ error handling, while providing tools for error analysis and resolution.
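To make this flow concrete, below is a minimal sketch of the insert-or-quarantine pattern, using the `clickhouse-connect` Python driver. The table names (`events`, `events_quarantine`), error categories, and schema are illustrative assumptions, not the actual Helios implementation:

```python
import json
from datetime import datetime, timezone

import clickhouse_connect  # official ClickHouse Python driver

client = clickhouse_connect.get_client(host="localhost")  # placeholder host

def categorize(error: Exception) -> str:
    """Map an exception to a coarse, human-readable error category."""
    text = str(error).lower()
    if "no such column" in text or "cannot parse" in text:
        return "schema_mismatch"
    return "insertion_failure"

def insert_or_quarantine(record: dict) -> None:
    try:
        client.insert("events", [list(record.values())],
                      column_names=list(record.keys()))
    except Exception as err:
        # Preserve the failed record alongside a concise error summary so
        # users can inspect exactly which records were rejected and why.
        client.insert(
            "events_quarantine",
            [[json.dumps(record), categorize(err), str(err)[:500],
              datetime.now(timezone.utc)]],
            column_names=["raw_record", "error_type",
                          "error_summary", "quarantined_at"],
        )
```

In Helios, logic of this kind lives inside the Lambda connector, and the optional AI summary step reads from the quarantine table afterwards.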

@@ -40,7 +41,9 @@ These optimizations improve system performance while reducing costs for users, a

Parallelization in the context of Lambdas means that multiple Lambda instances can be run at the same time. By default, the Lambda parallelization factor is set to 1. This means that only one Lambda instance can be a trigger for one Kinesis <TippyWrapper content="A shard is a unit of capacity within a Kinesis stream that provides a fixed amount of data throughput and serves as a partition for organizing events.">shard</TippyWrapper>. We adjusted this setting to 10, allowing up to ten Lambda instances to process data from a single Kinesis shard simultaneously. This significantly improves our ingestion capacity and scalability, increasing our system's ability to handle high-volume data streams quickly and efficiently.

-![Parallelization](/case_study/lambdakinesislimit.png)
+<p><img src="/case_study/lambdakinesislimit.png" alt="Parallelization" style="
+  margin-left: -5%;
+"></p>

### Caching DynamoDB requests

16 changes: 11 additions & 5 deletions docs/introduction.md
@@ -24,13 +24,13 @@ An **event** is a state change in a system or application. This could be as simp

In event streaming architecture, events are generated by **producers**, for example a web application that produces user click events, and can be processed by multiple **consumers**. A consumer is an entity that receives and processes messages or events from one or more streams. **Brokers** are the intermediaries that receive, store, and distribute events (e.g. queues).

-##### Event Streaming Platforms
+#### Event Streaming Platforms

**Event streaming platforms** act like a broker in that they receive events from producers and can have multiple consumers. Popular event streaming platforms include Apache Kafka, Google Pub/Sub, and Amazon Kinesis.

The defining characteristics of event streaming are its real-time nature - where data is processed as it arrives rather than being stored for later analysis, its continuity - with data being constantly added to the stream, and its unboundedness - meaning the total size of the data is unknown and potentially infinite.
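To make the producer role concrete, here is a minimal sketch of a producer pushing a click event into a Kinesis stream with boto3; the stream name and payload are illustrative:

```python
import json

import boto3

kinesis = boto3.client("kinesis")

# A producer emits an event (here, a user click) into the stream; consumers
# read it from the shard that the partition key maps to.
event = {"type": "click", "user_id": "u-42", "page": "/checkout"}
kinesis.put_record(
    StreamName="example-events",       # illustrative stream name
    Data=json.dumps(event).encode(),
    PartitionKey=event["user_id"],     # same user -> same shard, ordering kept
)
```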

-##### Event streaming use cases
+#### Event streaming use cases

Event streaming serves a variety of functions, including:

Expand All @@ -45,7 +45,9 @@ Event streaming serves a variety of functions, including:

While event streaming platforms excel at ingesting and processing high-volume, real-time data, they present a significant challenge for data analysis and exploration: data accessibility. Event streaming platforms are optimized for throughput and real-time processing, not for ad-hoc querying or historical analysis. This makes it difficult for analysts to explore past data or perform complex analyses on the fly.

-![Black Box](/case_study/blackbox.png)
+<p>
+  <img src="/case_study/blackbox.png" alt="Black Box" style="max-width: 100%; height: 400px; margin-inline: auto;">
+</p>

This limitation can significantly impact a team's ability to derive timely insights from their streaming data. To illustrate this challenge more concretely, let's consider a common use case in the e-commerce industry.

@@ -107,17 +109,21 @@ At its core, Helios is comprised of:
<p><Icon name="WindowIcon" /><span>Helios web application: offers an interface for connecting existing streams to the Helios backend infrastructure and an integrated SQL console querying and analyzing Kinesis event streams.</span></p>
</div>

-![Web app](/case_study/webapp.png)
+<br>
+<p><img src="/case_study/webapp.png" alt="Web app" style="border-radius: 5px; box-shadow: 10px 10px 20px rgba(0, 0, 0, 0.5);"></p>
+<br>

<div class="icon-list">
<p><Icon name="CommandLineIcon" /><span>Helios CLI: configures Helios deployment with AWS credentials; deploys the entire Helios stack to AWS using a single command; and destroys the stack when needed. We will go into more detail within the Automating Deployment section.</span></p>
</div>

<!-- ![CLI](/case_study/cli_dropshadow.png) -->

<video class="video" width="500" height="500" autoplay loop muted style="border-radius: 5px; box-shadow: 10px 10px 20px rgba(0, 0, 0, 0.5);">
<br>
<video class="video" width="500" height="500" autoplay loop muted style="border-radius: 5px; box-shadow: 10px 10px 20px rgba(0, 0, 0, 0.5); margin-inline: auto;">
<source src="/home/helios-cli.mp4" type="video/mp4">
</video>
<br>

As with any tool, the suitability of Helios depends on each team's specific requirements, existing infrastructure, and resources. We encourage potential users to evaluate how our offering aligns with their particular needs and constraints.

16 changes: 8 additions & 8 deletions docs/load-testing.md
@@ -14,14 +14,14 @@ We conducted basic load testing on Helios to evaluate its performance under high

### Infrastructure Setup

-| EC2 Instance                | Lambda Configuration                  | Kinesis Configuration        |
-| --------------------------- | ------------------------------------- | ---------------------------- |
-| Instance Type: c5.4xlarge   | Runtime: Python 3.12                  | Streams: 1 stream            |
-| vCPUs: 16                   | Memory: 1024 MB                       | Shards: 1 shard per stream   |
-| Memory: 32 GB               | Timeout: 15 minutes                   |                              |
-| Storage: 500gb gp2          | Concurrency: 10 instances per shard   |                              |
-|                             | Batch Size: 100                       |                              |
-|                             | Batch Window: 1 second                |                              |
+| EC2 Instance              | Lambda Configuration                | Kinesis Configuration      |
+| ------------------------- | ----------------------------------- | -------------------------- |
+| Instance Type: c5.4xlarge | Runtime: Python 3.12                | Streams: 1 stream          |
+| vCPUs: 16                 | Memory: 1024 MB                     | Shards: 1 shard per stream |
+| Memory: 32 GB             | Timeout: 15 minutes                 |                            |
+| Storage: 500gb gp2        | Concurrency: 10 instances per shard |                            |
+|                           | Batch Size: 100                     |                            |
+|                           | Batch Window: 1 second              |                            |

### Data Generation and Ingestion

Expand Down
Binary file modified docs/public/case_study/cli_dropshadow.png
Binary file modified docs/public/case_study/kinesis_integration1.png
100755 → 100644
Binary file modified docs/public/case_study/kinesis_integration2.png
100755 → 100644
Binary file modified docs/public/case_study/kinesis_to_helios.png
100755 → 100644
Binary file modified docs/public/case_study/lambdacoldstarts.png
100755 → 100644
Binary file modified docs/public/case_study/tinybird_arch.png
