How do I parse a CSV file with Duck DB #6

Open
tomlakelive opened this issue Jan 22, 2025 · 1 comment

Please let me know how I can parse a CSV file for ingestion into DuckDB.

Contributor

👋 Hello!

To parse a CSV file and ingest it into DuckDB, you can use the COPY command or the read_csv_auto function. Here’s how you can do it:

Using the COPY Command

The COPY command is a straightforward way to import data from a CSV file into a table in DuckDB.

  1. Create a Table:
    First, you need to create a table with the appropriate schema to match the CSV file.

    CREATE TABLE my_table (
        column1 datatype,
        column2 datatype,
        -- add more columns as needed
    );
  2. Import the CSV File:
    Use the COPY command to import the CSV file into the table.

    COPY my_table FROM 'path/to/your/file.csv' (FORMAT 'csv', HEADER true);
    • FORMAT 'csv' specifies that the file is in CSV format.
    • HEADER true indicates that the first row of the CSV file contains column headers.

Using the read_csv_auto Function

The read_csv_auto function is a convenient way to automatically infer the schema and import the CSV file.

  1. Import the CSV File:
    Use the read_csv_auto function to import the CSV file directly into a table.

    CREATE TABLE my_table AS
    SELECT * FROM read_csv_auto('path/to/your/file.csv');
    • This command will automatically create a table with the inferred schema and import the data from the CSV file.
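Before materializing a table, it can help to check what schema would be inferred. The following is a minimal sketch that reuses the placeholder path from above; the column names and types it reports depend entirely on the contents of your file.

```sql
-- Show the column names and types DuckDB would infer, without creating a table.
DESCRIBE SELECT * FROM read_csv_auto('path/to/your/file.csv');

-- Preview the first few rows straight from the file.
SELECT * FROM read_csv_auto('path/to/your/file.csv') LIMIT 5;
```

If the inferred types are not what you expect, the COPY approach with an explicit CREATE TABLE gives you full control over the schema.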

Example

Suppose you have a CSV file data.csv with the following content:

id,name,age
1,John,30
2,Jane,25
3,Bob,35

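You can import this file into DuckDB using either method. Pick one: both create a table named my_table, so running the second after the first will fail because the table already exists.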

Using COPY Command

CREATE TABLE my_table (
    id INTEGER,
    name VARCHAR,
    age INTEGER
);

COPY my_table FROM 'data.csv' (FORMAT 'csv', HEADER true);

Using read_csv_auto Function

CREATE TABLE my_table AS
SELECT * FROM read_csv_auto('data.csv');
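
Whichever method you choose, a quick check afterwards confirms that the import worked. This sketch assumes my_table was created from the data.csv sample above.

```sql
-- List the columns and their types.
DESCRIBE my_table;

-- Check that all three data rows arrived (the header row is not counted).
SELECT count(*) AS row_count FROM my_table;

-- Look at the data itself.
SELECT id, name, age FROM my_table ORDER BY id;
```

With read_csv_auto the reported types may differ from the hand-written schema; for example, integer columns are likely to be inferred as BIGINT rather than INTEGER.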

Additional Options

  • Delimiter: If your CSV file uses a different delimiter, you can specify it using the DELIMITER option.

    COPY my_table FROM 'data.csv' (FORMAT 'csv', HEADER true, DELIMITER ';');
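  • Quote Character: If your CSV file uses a different quote character, you can specify it using the QUOTE option. The example below passes the double quote, which is also the default, so substitute the character your file actually uses.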

    COPY my_table FROM 'data.csv' (FORMAT 'csv', HEADER true, QUOTE '"');
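
The same settings can also be passed on the function-based path. The sketch below assumes a semicolon-delimited data.csv; header, delim, and quote are the read_csv parameter names as I understand them from the DuckDB documentation, so verify them against the version you are running.

```sql
-- Create the table from the file while overriding the CSV dialect settings.
CREATE TABLE my_table AS
SELECT *
FROM read_csv('data.csv',
              header = true,  -- first row holds the column names
              delim  = ';',   -- field separator used by the file
              quote  = '"');  -- character that wraps quoted fields
```

Any setting you leave out is detected automatically, so it is usually enough to override only the one that detection gets wrong.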

For more details and options, refer to the documentation on the COPY command: [1].

For more details on the read_csv_auto function, refer to its documentation: [2].

For more information on working with CSV files in DuckDB, refer to the DuckDB documentation on CSV files: [1][2]

I am a bot powered by Hugging Face. Please verify any information provided.
