📦 NEW: Parse primitive support
msaaddev committed Feb 4, 2025
1 parent 0842c45 commit 6fef769
Showing 6 changed files with 246 additions and 47 deletions.
99 changes: 99 additions & 0 deletions examples/nodejs/examples/parse/composable-ai.md
@@ -0,0 +1,99 @@
# Composable AI

## The Developer Friendly Future of AI Infrastructure

In software engineering, composition is a powerful concept. It allows for building complex systems from simple, interchangeable parts. Think Legos, Docker containers, React components. Langbase extends this concept to AI infrastructure with our **Composable AI** stack using [Pipes][pipe] and [Memory][memory].

---

## Why Composable AI?

**Composable and personalized AI**: With Langbase, you can compose multiple models together into pipelines. It's easier to think about and easier to develop for, and each pipe lets you choose which model to use for each task. You can see the cost of every step and let your customers hyper-personalize.

**Effortlessly zero-config AI infra**: Maybe you want to use a smaller, domain-specific model for one task, and a larger general-purpose model for another task. Langbase makes it easy to use the right primitives and tools for each part of the job and provides developers with a zero-config composable AI infrastructure.

That's a nice way of saying, *you get a unicorn-scale API in minutes, not months*.

> **The most common problem** I hear about in Gen AI space is that my AI agents are too complex and I can't scale them, too much AI talking to AI. I don't have control, I don't understand the cost, and the impact of this change vs that. Time from new model to prod is too long. Feels static, my customers can't personalize it. ⌘ Langbase fixes all this. — [AA](https://www.linkedin.com/in/MrAhmadAwais/)
---

## Interactive Example: Composable AI Email Agent

But how does Composable AI work?

Here's an interactive example of a composable AI email agent that classifies, summarizes, and responds. Click to send a spam or valid email and see how composable it is: swap any pipe or LLM, hyper-personalize (for you or your users), and observe costs. Everything is composable.



## Example: Composable AI Email Agent


I have built an AI email agent that can read my emails, understand the sentiment, summarize, and respond to them. Let's break down how it works. Hint: several pipes work together to make smart, personalized decisions.

1. I created a pipe: `email-sentiment` — this one reads my emails to understand the sentiment
2. `email-summarizer` pipe — it summarizes my emails so I can quickly understand them
3. `email-decision-maker` pipe — should I respond? is it urgent? is it a newsletter?
4. If `email-decision-maker` pipe says *yes*, then I need to respond. This invokes the final pipe
5. `email-writer` pipe — writes a draft response to my emails with one of the eight formats I have
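
The five steps above can be sketched as a small orchestration function. This is a minimal sketch under stated assumptions, not the actual agent: the `Pipe` type, `handleEmail`, and the stub pipes are hypothetical names, and each stub stands in for a call to a deployed pipe.

```ts
// Hypothetical model: a "pipe" is any async function from text to text.
// A real version would call a deployed Langbase pipe instead of these stubs.
type Pipe = (input: string) => Promise<string>;

interface EmailAgentPipes {
	sentiment: Pipe; // stands in for `email-sentiment`
	summarizer: Pipe; // stands in for `email-summarizer`
	decisionMaker: Pipe; // stands in for `email-decision-maker`; answers 'yes' or 'no'
	writer: Pipe; // stands in for `email-writer`
}

interface EmailResult {
	sentiment: string;
	summary: string;
	shouldRespond: boolean;
	draft?: string;
}

async function handleEmail(
	email: string,
	pipes: EmailAgentPipes,
): Promise<EmailResult> {
	// Steps 1 and 2 do not depend on each other, so run them concurrently.
	const [sentiment, summary] = await Promise.all([
		pipes.sentiment(email),
		pipes.summarizer(email),
	]);

	// Steps 3 and 4: only invoke the writer when the decision is "yes".
	const decision = await pipes.decisionMaker(email);
	const shouldRespond = decision.trim().toLowerCase() === 'yes';
	if (!shouldRespond) return {sentiment, summary, shouldRespond};

	// Step 5: draft a reply from the summary.
	const draft = await pipes.writer(summary);
	return {sentiment, summary, shouldRespond, draft};
}

// Local stubs so the sketch runs without any API key or network access.
const stubPipes: EmailAgentPipes = {
	sentiment: async email => (email.includes('!') ? 'excited' : 'neutral'),
	summarizer: async email => email.slice(0, 40),
	decisionMaker: async email =>
		email.toLowerCase().includes('newsletter') ? 'no' : 'yes',
	writer: async summary => `Thanks for your note about: ${summary}`,
};
```

Because every step is passed in through `pipes`, any one of them can be replaced without touching `handleEmail`.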


## Why is Composable AI powerful?

Ah, the power of composition. I can swap out any of these pipes with a new one.

- **Flexibility**: Swap components without rewriting everything
- **Reusability**: Build complex systems from simple, tested parts
- **Scalability**: Optimize at the component level for better performance
- **Observability**: Monitor and debug each step of your AI pipeline


### Control flow

- Maybe I want to use a different sentiment analysis model
- Or maybe I want to use a different summarizer when I'm on vacation
- I can choose a different LLM (small or large) based on the task
- BTW I definitely use a different `decision-maker` pipe on a busy day.
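
One way to picture this kind of swapping is a default set of pipes plus context-based overrides. The sketch below is illustrative only: `resolvePipes`, `DayContext`, and the `-vacation` and `-busy` pipe names are assumptions made up for the example.

```ts
// Hypothetical pipe names: only the four defaults appear in the text above;
// the "-vacation" and "-busy" variants are invented for illustration.
type Step = 'sentiment' | 'summarizer' | 'decisionMaker' | 'writer';
type PipeSet = Record<Step, string>;

const defaultPipes: PipeSet = {
	sentiment: 'email-sentiment',
	summarizer: 'email-summarizer',
	decisionMaker: 'email-decision-maker',
	writer: 'email-writer',
};

interface DayContext {
	onVacation: boolean;
	busyDay: boolean;
}

// Start from the defaults and override only the steps the context calls for.
function resolvePipes(ctx: DayContext): PipeSet {
	const overrides: Partial<PipeSet> = {};
	if (ctx.onVacation) overrides.summarizer = 'email-summarizer-vacation';
	if (ctx.busyDay) overrides.decisionMaker = 'email-decision-maker-busy';
	return {...defaultPipes, ...overrides};
}
```

Steps that are not overridden keep their default pipe, so a change to one step never ripples into the others.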

### Extensibility

- **Add more when needed**: I can also add more pipes to this pipeline. Maybe I want to add a pipe that checks my calendar or the weather before I respond to an email. You get the idea. Always bet on composition.
- **Eight Formats to write emails**: And I have several formats. Because Pipes are composable, I have eight different versions of `email-writer` pipe. I have a pipe `email-pick-writer` that picks the correct pipe to draft a response with. Why? I talk to my friends differently than my investors, reports, managers, vendors — you name it.
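
The routing idea behind `email-pick-writer` can be sketched as a lookup with a fallback. In the agent described above the choice is made by a pipe; here a plain function stands in for it, and the audience labels and writer pipe names are hypothetical.

```ts
// Hypothetical audiences and writer pipe names; the text only says there are
// eight `email-writer` variants and an `email-pick-writer` pipe that chooses one.
type Audience = 'friend' | 'investor' | 'report' | 'manager' | 'vendor';

const writerByAudience: Record<Audience, string> = {
	friend: 'email-writer-friend',
	investor: 'email-writer-investor',
	report: 'email-writer-report',
	manager: 'email-writer-manager',
	vendor: 'email-writer-vendor',
};

const fallbackWriter = 'email-writer';

// Own-property check so inherited keys such as "constructor" never match.
function isAudience(label: string): label is Audience {
	return Object.prototype.hasOwnProperty.call(writerByAudience, label);
}

// Normalize the label, then route to the matching writer or the fallback.
function pickWriter(label: string): string {
	const key = label.trim().toLowerCase();
	return isAudience(key) ? writerByAudience[key] : fallbackWriter;
}
```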


### Long-term memory and context awareness

- By the way, I have all my emails in an `emails-store` memory, which any of these pipes can refer to if needed. That's managed [semantic RAG][memory] over all the emails I have ever received.
- And yes, my `emails-smart-spam` memory knows all the pesky smart spam emails that I don't want to see in my inbox.
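
To give a rough feel for what semantic retrieval over a memory involves, here is a toy sketch that ranks stored emails by cosine similarity to a query vector. It is not how Langbase Memory is implemented; the `StoredEmail` shape, the function names, and the hand-written vectors are all assumptions, and a real system would get its embeddings from an embedding model.

```ts
// Hypothetical record shape: an email id plus a precomputed embedding vector.
interface StoredEmail {
	id: string;
	embedding: number[];
}

// Cosine similarity: dot(a, b) / (|a| * |b|), or 0 if either vector is all zeros.
function cosineSimilarity(a: number[], b: number[]): number {
	if (a.length !== b.length) {
		throw new Error('Vectors must have the same length');
	}
	let dot = 0;
	let normA = 0;
	let normB = 0;
	a.forEach((x, i) => {
		const y = b[i] ?? 0;
		dot += x * y;
		normA += x * x;
		normB += y * y;
	});
	if (normA === 0 || normB === 0) return 0;
	return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Return the ids of the k stored emails most similar to the query vector.
function retrieveTopK(
	query: number[],
	store: StoredEmail[],
	k: number,
): string[] {
	return store
		.map(email => ({
			id: email.id,
			score: cosineSimilarity(query, email.embedding),
		}))
		.sort((left, right) => right.score - left.score)
		.slice(0, k)
		.map(entry => entry.id);
}
```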

### Cost & Observability

- Because each intent and action is mapped to a Pipe, which is an excellent primitive for using LLMs, I can see everything related to the cost, usage, and effectiveness of each pipe: how many emails were processed, how many were responded to, how many were marked as spam, and so on.
- I can switch LLMs for any of these actions, [fork a pipe][fork], and see how it performs. I can version my pipes and see how the new version performs against the old one.
- And we're just getting started …
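
Per-pipe observability boils down to grouping usage records by pipe. The sketch below assumes a hypothetical `PipeRun` record with `tokens` and `costUsd` fields; it does not reflect any actual Langbase response or dashboard schema.

```ts
// Hypothetical usage record: one entry per pipe invocation.
interface PipeRun {
	pipe: string;
	tokens: number;
	costUsd: number;
}

interface PipeTotals {
	runs: number;
	tokens: number;
	costUsd: number;
}

// Group runs by pipe name and accumulate run count, tokens, and cost.
function summarizeRuns(runs: PipeRun[]): Map<string, PipeTotals> {
	const totals = new Map<string, PipeTotals>();
	for (const run of runs) {
		const current = totals.get(run.pipe) ?? {runs: 0, tokens: 0, costUsd: 0};
		totals.set(run.pipe, {
			runs: current.runs + 1,
			tokens: current.tokens + run.tokens,
			costUsd: current.costUsd + run.costUsd,
		});
	}
	return totals;
}
```

With totals keyed by pipe, comparing a forked or re-versioned pipe against the original is a matter of comparing two entries.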

### Why Developers Love It

- **Modular**: Build, test, and deploy pipes x memorysets independently
- **Extensible**: API-first, with no dependency on a single language
- **Version Control Friendly**: Track changes at the pipe level
- **Cost-Effective**: Optimize resource usage for each AI task
- **Stakeholder Friendly**: Collaborate with your team on each pipe and memory. All your R&D team, engineering, product, GTM (marketing, sales), and even stakeholders can collaborate on the same pipe. It's like a Google Doc x GitHub for AI. That's what makes it so powerful.

---

Each pipe and memory is like a Docker container. You can have any number of pipes and memorysets.

Can't wait to share more exciting examples of composable AI. We're cookin!!

We'll share more on this soon. Follow us on [Twitter][x] and [LinkedIn][li] for updates.

[pipe]: /pipe/
[memory]: /memory
[signup]: https://langbase.fyi/awesome
[x]: https://twitter.com/LangbaseInc
[li]: https://www.linkedin.com/company/langbase/
[email]: mailto:[email protected]?subject=Pipe-Quickstart&body=Ref:%20https://langbase.com/docs/pipe/quickstart
[fork]: https://langbase.com/docs/features/fork

---
27 changes: 27 additions & 0 deletions examples/nodejs/examples/parse/index.ts
@@ -0,0 +1,27 @@
import 'dotenv/config';
import {Langbase} from 'langbase';
import fs from 'fs';
import path from 'path';

const langbase = new Langbase({
	apiKey: process.env.LANGBASE_API_KEY!,
});

async function main() {
	const documentPath = path.join(
		process.cwd(),
		'examples',
		'parse',
		'composable-ai.md',
	);

	const results = await langbase.parse({
		document: fs.readFileSync(documentPath),
		documentName: 'composable-ai.md',
		// The document is Markdown, so send the matching MIME type.
		contentType: 'text/markdown',
	});

	console.log(results);
}

main();
3 changes: 2 additions & 1 deletion examples/nodejs/package.json
@@ -31,7 +31,8 @@
 		"tools.web-search": "npx tsx ./examples/tools/web-search.ts",
 		"tools.crawl": "npx tsx ./examples/tools/crawl.ts",
 		"embed": "npx tsx ./examples/embed/index.ts",
-		"chunk": "npx tsx ./examples/chunk/index.ts"
+		"chunk": "npx tsx ./examples/chunk/index.ts",
+		"parse": "npx tsx ./examples/parse/index.ts"
 	},
 	"keywords": [],
 	"author": "Ahmad Awais <[email protected]> (https://twitter.com/MrAhmadAwais)",
3 changes: 3 additions & 0 deletions examples/nodejs/readme.md
@@ -40,4 +40,7 @@ npm run chunk
 
 # embed
 npm run embed
+
+# parse
+npm run parse
 ```
106 changes: 60 additions & 46 deletions packages/langbase/src/langbase/langbase.ts
@@ -1,3 +1,4 @@
+import {convertDocToFormData} from '@/lib/utils/doc-to-formdata';
 import {Request} from '../common/request';
 
 export type Role = 'user' | 'assistant' | 'system' | 'tool';
@@ -191,6 +192,14 @@ export type EmbeddingModels =
 	| 'cohere:embed-multilingual-light-v3.0'
 	| 'google:text-embedding-004';
 
+export type ContentType =
+	| 'application/pdf'
+	| 'text/plain'
+	| 'text/markdown'
+	| 'text/csv'
+	| 'application/vnd.openxmlformats-officedocument.spreadsheetml.sheet'
+	| 'application/vnd.ms-excel';
+
 export interface MemoryCreateOptions {
 	name: string;
 	description?: string;
@@ -223,13 +232,7 @@ export interface MemoryUploadDocOptions {
 	documentName: string;
 	meta?: Record<string, string>;
 	document: Buffer | File | FormData | ReadableStream;
-	contentType:
-		| 'application/pdf'
-		| 'text/plain'
-		| 'text/markdown'
-		| 'text/csv'
-		| 'application/vnd.openxmlformats-officedocument.spreadsheetml.sheet'
-		| 'application/vnd.ms-excel';
+	contentType: ContentType;
 }
 
 export interface MemoryRetryDocEmbedOptions {
@@ -263,13 +266,7 @@ export interface MemoryListDocResponse {
 	status_message: string | null;
 	metadata: {
 		size: number;
-		type:
-			| 'application/pdf'
-			| 'text/plain'
-			| 'text/markdown'
-			| 'text/csv'
-			| 'application/vnd.openxmlformats-officedocument.spreadsheetml.sheet'
-			| 'application/vnd.ms-excel';
+		type: ContentType;
 	};
 	enabled: boolean;
 	chunk_size: number;
@@ -316,20 +313,25 @@ export type EmbedResponse = number[][];
 export interface ChunkOptions {
 	document: Buffer | File | FormData | ReadableStream;
 	documentName: string;
-	contentType:
-		| 'application/pdf'
-		| 'text/plain'
-		| 'text/markdown'
-		| 'text/csv'
-		| 'application/vnd.openxmlformats-officedocument.spreadsheetml.sheet'
-		| 'application/vnd.ms-excel';
+	contentType: ContentType;
 	chunkMaxLength?: string;
 	chunkOverlap?: string;
 	separator?: string;
 }
 
 export type ChunkResponse = string[];
 
+export type ParseOptions = {
+	document: Buffer | File | FormData | ReadableStream;
+	documentName: string;
+	contentType: ContentType;
+};
+
+export type ParseResponse = {
+	documentName: string;
+	content: string;
+};
+
 export class Langbase {
 	private request: Request;
 	private apiKey: string;
@@ -375,6 +377,7 @@ export class Langbase {
 
 	public embed: (options: EmbedOptions) => Promise<EmbedResponse>;
 	public chunk: (options: ChunkOptions) => Promise<ChunkResponse>;
+	public parse: (options: ParseOptions) => Promise<ParseResponse>;
 
 	constructor(options?: LangbaseOptions) {
 		this.baseUrl = options?.baseUrl ?? 'https://api.langbase.com';
@@ -415,6 +418,7 @@ export class Langbase {
 
 		this.embed = this.generateEmbeddings.bind(this);
 		this.chunk = this.chunkDocument.bind(this);
+		this.parse = this.parseDocument.bind(this);
 	}
 
 	private async runPipe(
@@ -714,32 +718,12 @@ export class Langbase {
 	 * @returns A promise that resolves to the chunked document response.
 	 */
 	private async chunkDocument(options: ChunkOptions): Promise<ChunkResponse> {
-		let formData = new FormData();
-
-		if (options.document instanceof Buffer) {
-			const documentBlob = new Blob([options.document], {
-				type: options.contentType,
-			});
-			formData.append('document', documentBlob, options.documentName);
-		} else if (options.document instanceof File) {
-			formData.append('document', options.document, options.documentName);
-		} else if (options.document instanceof FormData) {
-			formData = options.document;
-		} else if (options.document instanceof ReadableStream) {
-			const chunks: Uint8Array[] = [];
-			const reader = options.document.getReader();
-
-			while (true) {
-				const {done, value} = await reader.read();
-				if (done) break;
-				chunks.push(value);
-			}
-
-			const documentBlob = new Blob(chunks, {type: options.contentType});
-			formData.append('document', documentBlob, options.documentName);
-		}
+		const formData = await convertDocToFormData({
+			document: options.document,
+			documentName: options.documentName,
+			contentType: options.contentType,
+		});
 
-		formData.append('documentName', options.documentName);
 		if (options.chunkMaxLength)
 			formData.append('chunkMaxLength', options.chunkMaxLength);
 		if (options.chunkOverlap)
@@ -756,4 +740,34 @@ export class Langbase {
 
 		return response.json();
 	}
+
+	/**
+	 * Parses a document using the Langbase API.
+	 *
+	 * @param options - The options for parsing the document
+	 * @param options.document - The document to be parsed
+	 * @param options.documentName - The name of the document
+	 * @param options.contentType - The content type of the document
+	 *
+	 * @returns A promise that resolves to the parse response from the API
+	 *
+	 * @throws {Error} If the API request fails
+	 */
+	private async parseDocument(options: ParseOptions): Promise<ParseResponse> {
+		const formData = await convertDocToFormData({
+			document: options.document,
+			documentName: options.documentName,
+			contentType: options.contentType,
+		});
+
+		const response = await fetch(`${this.baseUrl}/v1/parse`, {
+			method: 'POST',
+			headers: {
+				Authorization: `Bearer ${this.apiKey}`,
+			},
+			body: formData,
+		});
+
+		return response.json();
+	}
 }
55 changes: 55 additions & 0 deletions packages/langbase/src/lib/utils/doc-to-formdata.ts
@@ -0,0 +1,55 @@

/**
 * Converts various document formats to FormData.
 *
 * @param options - The conversion options
 * @param options.document - The document to convert. Can be Buffer, File, FormData or ReadableStream
 * @param options.documentName - The name of the document
 * @param options.contentType - The MIME type of the document
 *
 * @returns A Promise that resolves to FormData containing the document
 *
 * @example
 * ```ts
 * const buffer = Buffer.from('Hello World');
 * const formData = await convertDocToFormData({
 *   document: buffer,
 *   documentName: 'hello.txt',
 *   contentType: 'text/plain'
 * });
 * ```
 */
export async function convertDocToFormData(options: {
	document: Buffer | File | FormData | ReadableStream;
	documentName: string;
	contentType: string;
}) {
	let formData = new FormData();

	if (options.document instanceof Buffer) {
		const documentBlob = new Blob([options.document], {
			type: options.contentType,
		});
		formData.append('document', documentBlob, options.documentName);
	} else if (options.document instanceof File) {
		formData.append('document', options.document, options.documentName);
	} else if (options.document instanceof FormData) {
		formData = options.document;
	} else if (options.document instanceof ReadableStream) {
		const chunks: Uint8Array[] = [];
		const reader = options.document.getReader();

		while (true) {
			const {done, value} = await reader.read();
			if (done) break;
			chunks.push(value);
		}

		const documentBlob = new Blob(chunks, {type: options.contentType});
		formData.append('document', documentBlob, options.documentName);
	}

	formData.append('documentName', options.documentName);

	return formData;
}
