
Optional type info for Prompt parameters, or better allow JsonSchema for prompts as well #136

Open
headinthebox opened this issue Jan 9, 2025 · 10 comments
Labels
enhancement New feature or request

Comments

@headinthebox

Conceptually, prompts and tools are very similar: a prompt is essentially a tool that returns a list of messages. However, the types of Prompts and Tools are unnecessarily different. Why not specify prompt parameters using JSON Schema as well?

If that goes too far, I'd at least like to be able to specify an inputType for a PromptArgument so that the UI for entering arguments can reflect their types.

export interface Prompt {
  /**
   * The name of the prompt or prompt template.
   */
  name: string;
  /**
   * An optional description of what this prompt provides
   */
  description?: string;
  /**
   * A list of arguments to use for templating the prompt.
   */
  arguments?: PromptArgument[];
}

export interface PromptArgument {
  /**
   * The name of the argument.
   */
  name: string;
  /**
   * A human-readable description of the argument.
   */
  description?: string;
  /**
   * Whether this argument must be provided.
   */
  required?: boolean;
}

export interface Tool {
  /**
   * The name of the tool.
   */
  name: string;
  /**
   * A human-readable description of the tool.
   */
  description?: string;
  /**
   * A JSON Schema object defining the expected parameters for the tool.
   */
  inputSchema: {
    type: "object";
    properties?: { [key: string]: object };
    required?: string[];
  };
}
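
For illustration, the inputType idea could look something like this (a sketch only; inputType and enumValues are hypothetical names, not part of the spec):

export interface PromptArgument {
  name: string;
  description?: string;
  required?: boolean;
  /**
   * Hypothetical: a hint telling client UIs what kind of input to collect.
   */
  inputType?: "text" | "number" | "date" | "time" | "boolean" | "enum";
  /**
   * Hypothetical: the allowed values when inputType is "enum".
   */
  enumValues?: string[];
}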
@headinthebox headinthebox added the enhancement New feature or request label Jan 9, 2025
@jspahrsummers
Member

jspahrsummers commented Jan 10, 2025

We discussed this a bit while designing prompts, and IIRC our main concern was supporting slash command style inputs (/myprompt foo bar), especially in CLIs, which is made harder if prompts accept complex types rather than just strings.

I think it's good if prompts can always support plain strings somehow, but maybe we can allow more structured types on top if the client and server both support them. Maybe like <img> tag alt text?

@dsp-ant Curious for your thoughts
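
For context, here is a minimal sketch of the slash-command mapping that string-only arguments enable today (the argument names and parsing logic are illustrative, not from the spec):

// Parse "/myprompt foo bar" into a prompts/get request payload,
// pairing positional tokens with the declared argument names.
function parseSlashCommand(
  input: string,
  declaredArgs: { name: string }[]
): { name: string; arguments: { [key: string]: string } } {
  const [command, ...tokens] = input.slice(1).split(/\s+/);
  const args: { [key: string]: string } = {};
  declaredArgs.forEach((arg, i) => {
    if (tokens[i] !== undefined) args[arg.name] = tokens[i];
  });
  return { name: command, arguments: args };
}

// parseSlashCommand("/myprompt foo bar", [{ name: "topic" }, { name: "tone" }])
// => { name: "myprompt", arguments: { topic: "foo", tone: "bar" } }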

@headinthebox
Author

I'd be OK if you could specify only simple types (like text, number, date, time, boolean, enum)

@dsp-ant
Member

dsp-ant commented Jan 15, 2025

I very much understand the sentiment. That the arguments for prompts and tools are different is not very consistent.

My main worry is the added complexity on the client side to implement this. Implementing the full JSON Schema spec is rather cumbersome and potentially requires a wide variety of UI components to get right. Similarly, reducing the valid inputs while still using some form of JSON Schema would be wildly confusing. Notably, any change has an impact on how we treat Completion results.

I think we have the following options:

  • Full JSON Schema: This is the most flexible option for arguments and gives a lot of agency over the definition, at the cost of very high complexity in client UIs. It requires us to rethink how completion will work across resource template parameters and prompts, and some types cannot be mapped to URIs.
  • Increase fidelity of the existing implementation: Allowing additional types allows for more flexible arguments. This is likely backward compatible, and likely compatible across URI templates. However, the drawback is that it is still a different mechanism from tools. The question here is where to draw the line, as I would assume people will always want one more type.
  • Keep as is: Restrictive, but known to work.

My personal preference here would be expanding the types to some subset of scalar types, and expanding and implementing a generic Argument type that can be reused in Completion requests (sketched below).

@headinthebox Curious if you would be willing to spearhead this, otherwise happy to just keep discussing and we can try to do it on our end.
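
A rough sketch of what such a shared Argument type could look like (a hypothetical shape, not spec text):

// Hypothetical: one Argument shape shared by prompts and
// completion/complete requests, limited to scalar types.
export type ArgumentType = "string" | "number" | "boolean" | "date";

export interface Argument {
  name: string;
  description?: string;
  required?: boolean;
  /** Defaults to "string" when omitted, preserving current behavior. */
  type?: ArgumentType;
}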

@headinthebox
Author

Curious if you would be willing to spearhead this, otherwise happy to just keep discussing and we can try to do it on our end.

Happy to.

@headinthebox
Author

headinthebox commented Jan 16, 2025

OK, I have been playing around with fleshing out Option 2 (not even looking at completion).
It gets messy rather quickly, and because of feature creep it soon evolves into a subset of JSON Schema.
I think this will always feel like a compromise and make things more complicated instead of simpler.

export interface Prompt {
  name: string;
  description?: string;
  arguments?: PromptArgument[];
}

export interface PromptArgument {
  /**
   * The name of the argument.
   */
  name: string;

  /**
   * A brief description of the argument's purpose.
   */
  description?: string;

  /**
   * Indicates whether the argument is required.
   */
  required?: boolean;

  /**
   * A regular expression to validate the input format.
   * Use this to enforce specific input patterns (e.g., phone numbers, postal codes).
   */
  pattern?: RegExp;

  /**
   * Specifies the format of the input value.
   * This provides semantic meaning, such as "email", "date", or "uuid".
   */
  format?: Format;

  /**
   * A predefined list of allowed values for the argument.
   * The input must match one of the primitive values in this array.
   * Supports types like string, number, boolean, null, etc.
   */
  enum?: Primitive[];

  /**
   * Indicates whether multiple lines of input are allowed.
   * If true, the input field can accept line breaks (e.g., textarea behavior).
   */
  isMultiline?: boolean;

  /**
   * A hint specifying the maximum number of characters to collect.
   */
  maxLength?: number;

  /**
   * Indicates whether the input can consist of multiple values separated by a delimiter.
   * If true, the `enum` must define allowed values, and the `delimiter` property can specify
   * the character(s) used to separate values.
   */
  allowMultiple?: boolean;

  /**
   * Specifies the delimiter to use when `allowMultiple` is true.
   * Defaults to `","` if not provided.
   */
  delimiter?: string;
}

/**
 * Defines the supported formats for the `format` property in `PromptArgument`.
 * Includes common types like string, number, date, and more.
 */
export type Format =
  | "string" // General string input
  | "number" // General numeric input
  | "date" // ISO 8601 date (YYYY-MM-DD)
  | "date-time" // ISO 8601 date-time (YYYY-MM-DDTHH:mm:ssZ)
  | "email" // Valid email address
  | "hostname" // Valid hostname (e.g., example.com)
  | "ipv4" // Valid IPv4 address (e.g., 192.168.0.1)
  | "ipv6" // Valid IPv6 address (e.g., 2001:db8::ff00:42:8329)
  | "uri" // Uniform Resource Identifier (e.g., https://example.com)
  | "uuid" // Universally Unique Identifier (e.g., 123e4567-e89b-12d3-a456-426614174000)
  | string; // Custom user-defined formats

/**
 * Represents all primitive types that can be used as values for the `enum` property.
 * Includes string, number, boolean, null, undefined, symbol, and bigint.
 */
type Primitive = string | number | boolean | null | undefined | symbol | bigint;

So it seems that the simpler alternative is to use the same schema as Tools, but either restrict the properties to primitive types (checked at runtime) or allow implementations to ignore non-primitive types.
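
For instance, the runtime restriction could be a small validation pass over the schema (a sketch; the primitive whitelist is an assumption):

// Hypothetical check: accept only schemas whose properties are
// primitive JSON Schema types; clients may ignore or reject the rest.
const PRIMITIVE_TYPES = new Set(["string", "number", "integer", "boolean"]);

function hasOnlyPrimitiveProperties(inputSchema: {
  type: "object";
  properties?: { [key: string]: object };
}): boolean {
  const props = inputSchema.properties ?? {};
  return Object.values(props).every(
    (prop) => PRIMITIVE_TYPES.has((prop as { type?: string }).type ?? "")
  );
}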

export interface Prompt {
  name: string;
  description?: string;
  /**
   * A JSON Schema object defining the expected parameters for the prompt
   * with caveats.
   */
  inputSchema: { 
    type: "object";
    properties?: { [key: string]: object }; 
    required?: string[];
  };
}
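
A concrete prompt declared this way might look like the following (an invented example; the prompt name and properties are illustrative):

const summarizePrompt: Prompt = {
  name: "summarize",
  description: "Summarize a document in a given style",
  inputSchema: {
    type: "object",
    properties: {
      // Primitive-typed properties only, per the caveats above.
      text: { type: "string", description: "The text to summarize" },
      maxWords: { type: "number", description: "Upper bound on length" },
      bulletPoints: { type: "boolean", description: "Use bullet points" },
    },
    required: ["text"],
  },
};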

The simplest solution, of course, is to keep things as is.

@wanderingnature

wanderingnature commented Jan 24, 2025

I have been interested in this topic myself and forked MCP to try a few things. You can easily accept arguments that are backward compatible with prompts just by changing the input to dict[str, Any] rather than the more strict dict[str, str]. With dict[str, Any] you are ALMOST the same as tools, meaning you can pass in complex structures, but the inputSchema (as with tools) just isn't there directly in the prompt argument for the client to inspect. But it works. I have not investigated what it would take to directly use inputSchema as tools do. A prompt like the one below works, although it takes some custom code on the client to extract the actual input type from the description field (which is why I also include the type information in the description).

from typing import Dict, List, Annotated
from mcp import GetPromptResult
from mcp.server.fastmcp import FastMCP
from mcp.types import TextContent, PromptMessage
from pydantic import BaseModel, Field, EmailStr

# Create server
mcp = FastMCP("ComplexDataServer")

# Model Definitions
class UserProfile(BaseModel):
    """User profile with contact and role information."""
    username: str = Field(..., min_length=1, description="User's unique identifier (str)")
    email: EmailStr = Field(..., description="User's email address (EmailStr)")
    roles: List[str] = Field(..., description="List of assigned roles (List[str])")
    preferences: Dict[str, List[str]] = Field(..., description="User preferences with multiple options (Dict[str, List[str]])")

class MetricsData(BaseModel):
    """Represents performance metrics as matrix data."""
    metric_names: List[str] = Field(..., description="Names of the metrics being tracked (List[str])")
    daily_values: List[List[float]] = Field(..., description="Matrix of daily metric values (List[List[float]])")
    targets: Dict[str, float] = Field(..., description="Target value for each metric (Dict[str, float])")

class TeamStructure(BaseModel):
    """Represents team hierarchy and responsibilities."""
    leads: Dict[str, List[str]] = Field(..., description="Team leads and their direct reports (Dict[str, List[str]])")
    specialties: Dict[str, Dict[str, List[str]]] = Field(..., description="Team specialties and capabilities (Dict[str, Dict[str, List[str]]])")
    locations: List[Dict[str, str]] = Field(..., description="Office locations and details (List[Dict[str, str]])")

class ComplexProjectData(BaseModel):
    """Complex project data combining various nested structures."""
    project_name: str = Field(..., min_length=1, description="Name of the project (str)")
    owner: UserProfile = Field(..., description="Project owner details (UserProfile)")
    metrics: MetricsData = Field(..., description="Project performance metrics (MetricsData)")
    team: TeamStructure = Field(..., description="Team organization details (TeamStructure)")

@mcp.prompt("nested_collections")
def nested_collections_prompt(
    project_name: Annotated[str, Field(description=ComplexProjectData.model_fields["project_name"].description)],
    # Owner fields (UserProfile)
    owner_username: Annotated[str, Field(description=UserProfile.model_fields["username"].description)],
    owner_email: Annotated[EmailStr, Field(description=UserProfile.model_fields["email"].description)],
    owner_roles: Annotated[List[str], Field(description=UserProfile.model_fields["roles"].description)],
    owner_preferences: Annotated[Dict[str, List[str]], Field(description=UserProfile.model_fields["preferences"].description)],
    # Metrics fields (MetricsData)
    metric_names: Annotated[List[str], Field(description=MetricsData.model_fields["metric_names"].description)],
    daily_values: Annotated[List[List[float]], Field(description=MetricsData.model_fields["daily_values"].description)],
    metric_targets: Annotated[Dict[str, float], Field(description=MetricsData.model_fields["targets"].description)],
    # Team fields (TeamStructure)
    team_leads: Annotated[Dict[str, List[str]], Field(description=TeamStructure.model_fields["leads"].description)],
    team_specialties: Annotated[Dict[str, Dict[str, List[str]]], Field(description=TeamStructure.model_fields["specialties"].description)],
    team_locations: Annotated[List[Dict[str, str]], Field(description=TeamStructure.model_fields["locations"].description)]
) -> GetPromptResult:
    """
    Generate a comprehensive project report with nested data structures.
    
    Args:
        project_name (str): Name of the project
        owner_username (str): User's unique identifier
        owner_email (EmailStr): User's email address
        owner_roles (List[str]): List of assigned roles
        owner_preferences (Dict[str, List[str]]): User preferences with multiple options
        metric_names (List[str]): Names of the metrics being tracked
        daily_values (List[List[float]]): Matrix of daily metric values
        metric_targets (Dict[str, float]): Target value for each metric
        team_leads (Dict[str, List[str]]): Team leads and their direct reports
        team_specialties (Dict[str, Dict[str, List[str]]]): Team specialties and capabilities
        team_locations (List[Dict[str, str]]): Office locations and details
    
    Returns:
        GetPromptResult: A result containing the formatted project report.
    """
    # Construct nested objects
    owner = UserProfile(
        username=owner_username,
        email=owner_email,
        roles=owner_roles,
        preferences=owner_preferences
    )
    
    metrics = MetricsData(
        metric_names=metric_names,
        daily_values=daily_values,
        targets=metric_targets
    )
    
    team = TeamStructure(
        leads=team_leads,
        specialties=team_specialties,
        locations=team_locations
    )
    
    input_data = ComplexProjectData(
        project_name=project_name,
        owner=owner,
        metrics=metrics,
        team=team
    )

    # Format the report (simplified for brevity)
    report = f"""
    Project: {input_data.project_name}
    Owner: {input_data.owner.username} ({input_data.owner.email})
    Roles: {', '.join(input_data.owner.roles)}
    Metrics Tracked: {', '.join(input_data.metrics.metric_names)}
    Teams: {', '.join(input_data.team.leads.keys())}
    """
    
    return GetPromptResult(
        description="Complex nested data project report",
        messages=[
            PromptMessage(
                role="user",
                content=TextContent(type="text", text=report),
            ),
        ],
    )

This example client can reconstruct the model class dynamically.

"""Enhanced Smart client for MCP Servers with example model discovery."""
import asyncio
import json
import sys
from pydantic import BaseModel, Field, create_model
from typing import Any, Dict, Type, get_type_hints, List, Optional
import traceback

from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client

def generate_example_input(model: Type[BaseModel]) -> str:
    """Generate example input data based on a Pydantic model's structure."""
    def generate_field_example(field_type: Any, field_name: str) -> Any:
        if field_type == str:
            return f"<string: {field_name}>"
        elif field_type == int:
            return f"<integer: {field_name}>"
        elif field_type == float:
            return f"<float: {field_name}>"
        elif field_type == bool:
            return f"<boolean: {field_name}>"
        elif hasattr(field_type, "__origin__"):
            origin = field_type.__origin__
            if origin == list:
                inner_type = field_type.__args__[0]
                return [generate_field_example(inner_type, f"{field_name} item")]
            elif origin == dict:
                key_type, value_type = field_type.__args__
                return {
                    generate_field_example(key_type, f"{field_name} key"):
                        generate_field_example(value_type, f"{field_name} value")
                }
        return f"<any: {field_name}>"

    example_data = {}
    for field_name, field_info in model.model_fields.items():
        field_type = get_type_hints(model).get(field_name, field_info.annotation)
        example_data[field_name] = generate_field_example(field_type, field_name)

    return json.dumps(example_data, indent=4)

def print_pydantic_model(model: Type[BaseModel]) -> None:
    """Prints a Pydantic model representation."""
    model_name = model.__name__
    print(f"class {model_name}(BaseModel):")
    for field_name, field_info in model.model_fields.items():
        field_type = get_type_hints(model).get(field_name, field_info.annotation)
        field_description = field_info.description or "No description provided"
        print(f"    {field_name}: {field_type} = Field(..., description={field_description!r})")

def extract_type_from_description(description: str) -> Optional[Any]:
    """Extract type information from field description."""
    import re
    match = re.search(r'\((.*?)\)$', description)
    if not match:
        return None

    type_str = match.group(1)
    basic_types = {
        'str': str, 'int': int, 'float': float, 'bool': bool,
        'EmailStr': str
    }

    if type_str in basic_types:
        return basic_types[type_str]

    def parse_nested_type(type_str: str) -> Any:
        list_match = re.match(r'List\[(.*)\]', type_str)
        if list_match:
            inner = list_match.group(1)
            return List[basic_types.get(inner, Any)]

        dict_match = re.match(r'Dict\[(.*),\s*(.*)\]', type_str)
        if dict_match:
            key_type = dict_match.group(1).strip()
            value_type = dict_match.group(2).strip()
            key = basic_types.get(key_type, str)
            return Dict[key, basic_types.get(value_type, Any)]

        return Any

    return parse_nested_type(type_str)

def create_pydantic_model_from_schema(schema: Dict[str, Any], model_name: str = "PromptInputModel") -> Type[BaseModel]:
    """Generate a Pydantic model from a JSON schema."""
    fields = {}
    for argument in schema.get("arguments", []):
        field_name = argument["name"]
        description = argument.get("description", "")
        field_type = extract_type_from_description(description)

        if field_type is None:
            field_type = {
                "string": str, "integer": int, "boolean": bool,
                "number": float, "null": type(None),
                "array": List[Any], "object": Dict[str, Any],
            }.get(argument.get("type"), Any)  # prompt arguments may not carry a "type" key

        required = argument.get("required", False)
        fields[field_name] = (field_type, Field(..., description=description) if required
                            else Field(None, description=description))

    return create_model(model_name, **fields)

async def discover_capabilities(session: ClientSession):
    """Discover and print server capabilities."""
    try:
        prompts_response = await session.list_prompts()
        print("Server Prompts:", [p.name for p in prompts_response.prompts])

        for p in prompts_response.prompts:
            print(f"#" * 80)
            print(f"\nPrompt: {p.name}")
            print(f"Description: {p.description}")

            json_schema = p.model_dump_json()
            DynamicModel = create_pydantic_model_from_schema(json.loads(json_schema))

            print("\nModel Class:")
            print_pydantic_model(DynamicModel)

            print("\nExample Input Structure:")
            print(generate_example_input(DynamicModel))

    except Exception as e:
        print(f"Error: {str(e)}")
        print(traceback.format_exc())

async def run_with_server(script_path: str):
    """Run operations within server connection context."""
    params = StdioServerParameters(command="/Users/deepnpisgah/.local/bin/uv",
                                 args=["run", "--with", "mcp", "mcp", "run", script_path])
    async with stdio_client(params) as streams:
        async with ClientSession(streams[0], streams[1]) as session:
            await session.initialize()
            await discover_capabilities(session)

async def main():
    if len(sys.argv) < 2:
        print("Usage: python prompt_client.py <server.py>")
        sys.exit(1)
    await run_with_server(sys.argv[1])

if __name__ == "__main__":
    asyncio.run(main())

Result:

[01/24/25 13:31:55] INFO     Processing request of type            server.py:436
                             ListPromptsRequest                                 
Server Prompts: ['nested_collections']
################################################################################

Prompt: nested_collections
Description: 
    Generate a comprehensive project report with nested data structures.
    
    Args:
        project_name (str): Name of the project
        owner_username (str): User's unique identifier
        owner_email (EmailStr): User's email address
        owner_roles (List[str]): List of assigned roles
        owner_preferences (Dict[str, List[str]]): User preferences with multiple options
        metric_names (List[str]): Names of the metrics being tracked
        daily_values (List[List[float]]): Matrix of daily metric values
        metric_targets (Dict[str, float]): Target value for each metric
        team_leads (Dict[str, List[str]]): Team leads and their direct reports
        team_specialties (Dict[str, Dict[str, List[str]]]): Team specialties and capabilities
        team_locations (List[Dict[str, str]]): Office locations and details
    
    Returns:
        GetPromptResult: A result containing the formatted project report.
    

Model Class:
class PromptInputModel(BaseModel):
    project_name: <class 'str'> = Field(..., description='Name of the project (str)')
    owner_username: <class 'str'> = Field(..., description="User's unique identifier (str)")
    owner_email: <class 'str'> = Field(..., description="User's email address (EmailStr)")
    owner_roles: typing.List[str] = Field(..., description='List of assigned roles (List[str])')
    owner_preferences: typing.Dict[str, typing.Any] = Field(..., description='User preferences with multiple options (Dict[str, List[str]])')
    metric_names: typing.List[str] = Field(..., description='Names of the metrics being tracked (List[str])')
    daily_values: typing.List[typing.Any] = Field(..., description='Matrix of daily metric values (List[List[float]])')
    metric_targets: typing.Dict[str, float] = Field(..., description='Target value for each metric (Dict[str, float])')
    team_leads: typing.Dict[str, typing.Any] = Field(..., description='Team leads and their direct reports (Dict[str, List[str]])')
    team_specialties: typing.Dict[str, typing.Any] = Field(..., description='Team specialties and capabilities (Dict[str, Dict[str, List[str]]])')
    team_locations: typing.List[typing.Any] = Field(..., description='Office locations and details (List[Dict[str, str]])')

Example Input Structure:
{
    "project_name": "<string: project_name>",
    "owner_username": "<string: owner_username>",
    "owner_email": "<string: owner_email>",
    "owner_roles": [
        "<string: owner_roles item>"
    ],
    "owner_preferences": {
        "<string: owner_preferences key>": "<any: owner_preferences value>"
    },
    "metric_names": [
        "<string: metric_names item>"
    ],
    "daily_values": [
        "<any: daily_values item>"
    ],
    "metric_targets": {
        "<string: metric_targets key>": "<float: metric_targets value>"
    },
    "team_leads": {
        "<string: team_leads key>": "<any: team_leads value>"
    },
    "team_specialties": {
        "<string: team_specialties key>": "<any: team_specialties value>"
    },
    "team_locations": [
        "<any: team_locations item>"
    ]
}

Process finished with exit code 0

@wanderingnature

I wasn't ready to make a pull request for my fork yet, so I just made a public repo that you can review if you like. https://github.com/wanderingnature/mcp-typed-prompts.git

@jspahrsummers
Member

So it seems that the simpler alternative is to use the same schema as Tools, but either restrict the properties to primitive types (checked at runtime) or allow implementations to ignore non-primitive types.

@headinthebox Using JSON Schema but allowing clients to ignore non-primitive types makes sense to me. And perhaps we prevent it in the SDKs for servers, but don't have to disallow it at the spec level?

This sounds like a pretty good path forward to me.

@jspahrsummers
Member

@wanderingnature It seems you're mostly modifying the SDK, but that won't be sufficient, as this requires spec-level changes and affects all MCP SDKs.

@wanderingnature

I am certainly modifying the Python SDK, and doing so by changing the input to dict[str, Any] rather than the more strict (overly strict, IMHO) dict[str, str] in the specification. Changing the specification to dict[str, Any] does not break anything in other SDKs if the client passes dict[str, str]; it only breaks if the client passes dict[str, Any] and the server is built on an SDK that isn't current. That is a reason for the SDKs to be updated to match the current specification, and any server that is intended to be used in the long term should always be built against the current SDK.

I also think this is a really small change. For the Python SDK it is just a few lines of code, and I suspect it is similarly easy to update other SDKs. Only changing the args in the specification is a pretty small change compared to matching the Tools interface and using JSON Schema, and it gets you quite a long way towards the goal of type info for Prompts in a pretty easy way. I also think that making Prompts match Tools is overkill, because they have fundamentally different use cases. I get that Prompts are intended to be templated and thus use much simpler args, but allowing Any does open up some nice capability without introducing (IMHO) any significant overhead.

dict[str, Any] is entirely backward compatible, and is easily forward compatible if servers are updated to use the latest SDK based on the latest schema.
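
For reference, the spec-level change being proposed is roughly the following (a sketch of the prompts/get request params as I understand them, not an official diff):

export interface GetPromptRequestParams {
  /** The name of the prompt or prompt template. */
  name: string;
  /**
   * Arguments to use for templating the prompt.
   * Currently: { [key: string]: string }
   * Proposed:  values loosened to allow structured data.
   */
  arguments?: { [key: string]: unknown };
}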
