Text2CAD: An AI-powered CAD/CAM assistant that integrates advanced Visual-Spatial Intelligence (VSI)

Project Overview

The project addresses the specific challenges of CAD/CAM tasks, focusing on text-3D interpretation and innovative solution generation. Our goal is to develop a system capable of:

Interpreting design specifications (textual or parametric)
Performing spatial reasoning and 3D visualization
Suggesting innovative design solutions

Data Acquisition & Preprocessing

In practice, your data might come from:

Existing 3D CAD models (e.g., .STEP, .STL, .OBJ, etc.)
Synthetic data generation
Text-based design specs or parametric definitions

For demonstration, let's assume we've collected a small set of 3D parts and their textual descriptions. We'll show how you might load them and perform basic preprocessing.

# For data manipulation
import os
import numpy as np
import pandas as pd

# For geometry processing
import open3d as o3d
import trimesh

# For deep learning & LLMs
import torch

from transformers import AutoModel, AutoTokenizer

# We'll simulate having a list of 3D object files
cad_files = [
    "part1.obj", 
    "part2.obj", 
    "assembly1.obj"
]

# Placeholder: if these files don't exist, skip loading
loaded_meshes = {}
for f in cad_files:
    if os.path.exists(f):
        mesh = trimesh.load(f)
        loaded_meshes[f] = mesh
        print(f"Loaded {f} with {len(mesh.vertices)} vertices and {len(mesh.faces)} faces.")
    else:
        print(f"File not found: {f} - skipping...")

# Example text-based design specs
text_specs = {
    "part1.obj": "A mechanical bracket with holes at each corner for screws...",
    "part2.obj": "A gear-like part with radial spokes, 100 mm outer diameter...",
    "assembly1.obj": "An assembly combining part1 and part2 with specific constraints..."
}

print("\nData acquisition & preprocessing complete.")

Building a Baseline 3D Representation System

For advanced spatial reasoning, we need 3D representations that are machine-readable. We can use point clouds, meshes, or parametric shapes depending on the tasks. Let's demonstrate a simple geometric operation:

def compute_bounding_box(mesh: trimesh.Trimesh):
    bbox = mesh.bounding_box
    return bbox.primitive.extents  # (length, width, height)

# Example bounding box calculation

for filename, mesh in loaded_meshes.items():
    bbox_dims = compute_bounding_box(mesh)
    print(f"{filename} bounding box (L, W, H): {bbox_dims}")
    
print("\nBaseline 3D representation system established.")

Incorporating Large Language Models (LLMs)

We want to combine language-based instructions/feedback with 3D geometry. We'll use a placeholder LLM (like GPT-2 or any open-source large model) to parse text specs and generate suggestions.

model_name = "DeepSeek-V3"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModel.from_pretrained(model_name)

def generate_text_prompt(prompt: str, max_length=50):
    # A simplistic placeholder for LLM text generation
    inputs = tokenizer(prompt, return_tensors="pt")
    outputs = model(**inputs)
    # We'll just return the hidden state shape for demonstration
    return outputs.last_hidden_state.shape

# Example usage
prompt_example = "Suggest possible design constraints for a gear with 100 mm outer diameter"
response_shape = generate_text_prompt(prompt_example)
print(f"LLM response (placeholder shape): {response_shape}")
print("\nLLM component initialized.")

Integrating VSI with 3D Geometry

Visual-Spatial Intelligence (VSI) involves tasks like:

Spatial layout optimization
Constraint satisfaction
3D manipulation and transformation

We'll set up a placeholder function that "analyzes" the 3D object and returns recommended transformations based on textual instructions.

def spatial_reasoning_module(mesh: trimesh.Trimesh, instruction: str):
    """
    Placeholder for a module that uses geometry + LLM-based reasoning to 
    propose transformations or checks on the mesh.
    """
    # For demonstration, we'll do a simple check:
    # if "reduce weight" in instruction -> scale model down
    recommended_transformation = None
    if "reduce weight" in instruction.lower():
        # Suggest a scale factor of 0.9
        recommended_transformation = "Scale mesh by 0.9"
    elif "enlarge" in instruction.lower():
        recommended_transformation = "Scale mesh by 1.1"
    else:
        recommended_transformation = "No transformation recommended"
    
    return recommended_transformation

# Example usage
for filename, mesh in loaded_meshes.items():
    instruction = "We want to reduce weight by optimizing geometry."
    transform = spatial_reasoning_module(mesh, instruction)
    print(f"For {filename}: {transform}")
    
print("\nIntegrated VSI placeholder module executed.")

Iterative Design Suggestions & Spatial Reasoning

Here, we demonstrate how one might loop through multiple rounds of instructions, combining text-based feedback and geometry updates to converge on a design solution.

design_instructions = [
    "Reduce the thickness of outer edges.",
    "Add holes to minimize weight.",
    "Ensure the part can withstand 200 N load on the bracket."
]

def iterative_design_process(mesh: trimesh.Trimesh, instructions: list):
    for idx, instr in enumerate(instructions):
        # Use text interpretation
        text_info = generate_text_prompt(instr)
        # Use spatial reasoning
        transform_suggestion = spatial_reasoning_module(mesh, instr)
        # Log the decisions (in real usage, you'd apply the transform)
        print(f"Iteration {idx+1}: {instr}")
        print(f" - LLM interpretation shape: {text_info}")
        print(f" - Spatial transform suggestion: {transform_suggestion}")
        print("--------------------------------------------------")

for filename, mesh in loaded_meshes.items():
    print(f"=== Iterative Design Process for {filename} ===")
    iterative_design_process(mesh, design_instructions)
    print()

print("\nIterative design suggestions demonstrated.")

Evaluation & Metrics

We can evaluate our AI assistant in multiple ways:

Geometric metrics: volume, surface area, bounding box, collision checks
Performance metrics: time to generate suggestions, solution quality
User feedback: subjective satisfaction or trust in the AI suggestions
Prototype tasks: e.g., can the AI propose valid constraints?

Below, we show a placeholder for some geometric metrics.

def evaluate_mesh(mesh: trimesh.Trimesh):
    # Compute volume, surface area
    vol = mesh.volume
    area = mesh.area
    return vol, area

for filename, mesh in loaded_meshes.items():
    vol, area = evaluate_mesh(mesh)
    print(f"{filename}: volume={vol:.2f}, surface_area={area:.2f}")

print("\nEvaluation placeholders complete.")

Future Directions & Next Steps

Advanced 3D Feature Extraction:
- Integrate libraries that handle parametric definitions and advanced geometry (e.g., FreeCAD, Onshape, or commercial APIs).
Fine-Tuning or Instruction-Tuning LLMs:
- Explore instruction-tuned large language models that better handle domain-specific vocabulary and design language.
Multimodal Fusion:
- Create a more direct pipeline that fuses 3D geometry embeddings with language embeddings. For instance, use vision transformers adapted to 3D or point cloud networks.
Human-in-the-Loop Approaches:
- Provide interactive UIs in Jupyter or specialized CAD frontends for real-time user feedback and incremental design modifications.
Trust & Explainability:
- Develop methods for explaining the AI’s suggestions, highlighting critical geometry changes or constraints.
Real-time Collaboration:
- Eventually integrate these capabilities into real design environments, enabling multiple engineers to interact with the AI system simultaneously.

Name		Name	Last commit message	Last commit date
Latest commit History 2 Commits
README.md		README.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Text2CAD: An AI-powered CAD/CAM assistant that integrates advanced Visual-Spatial Intelligence (VSI)

Project Overview

Data Acquisition & Preprocessing

Building a Baseline 3D Representation System

Incorporating Large Language Models (LLMs)

Integrating VSI with 3D Geometry

Iterative Design Suggestions & Spatial Reasoning

Evaluation & Metrics

Future Directions & Next Steps

About

Releases

Packages

zhengqinghuang2023/Text2CAD

Folders and files

Latest commit

History

Repository files navigation

Text2CAD: An AI-powered CAD/CAM assistant that integrates advanced Visual-Spatial Intelligence (VSI)

Project Overview

Data Acquisition & Preprocessing

Building a Baseline 3D Representation System

Incorporating Large Language Models (LLMs)

Integrating VSI with 3D Geometry

Iterative Design Suggestions & Spatial Reasoning

Evaluation & Metrics

Future Directions & Next Steps

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Packages