Hosted Project Proposal: language-specific wasi:http samples #113

Open
yoshuawuyts opened this issue Nov 20, 2024 · 12 comments

Comments

@yoshuawuyts
Member

yoshuawuyts commented Nov 20, 2024

Proposing the adoption of wasi:http samples as a BA hosted project, starting
with a sample for Rust.

Repository URL: https://github.com/yoshuawuyts/wasi-rust-sample

One of the main use cases for WASI today is writing HTTP handlers. The
promise is that these handlers can be language-specific and follow language
best practices, all while still targeting the portable and language-agnostic
WASI platform. To help users get started with WASI using their
programming language of choice, we want to give them all the tools necessary not
just to get started, but to succeed.

This hosted project proposes the addition of wasi:http-specific samples to the
BA's GitHub org. The structure for this is a single repository per programming
language, as that makes it possible for each sample to be cloned directly as a GitHub Template,
modified using GitHub Codespaces, and published to GitHub Artifacts. This
significantly lowers the bar to get to production-grade deployments.

The first sample will be authored in Rust, as tooling for that is far along
already. But our expectation is to author additional samples for other
programming languages in due time as well.

Requirements

Alignment with the Bytecode Alliance Mission

Projects must have alignment with the Bytecode Alliance mission:

Our mission is to provide state-of-the-art foundations to develop runtime environments and language toolchains where security, efficiency, and modularity can all coexist across a wide range of devices and architectures. We enable innovation in compilers, runtimes, and tooling, focusing on fine-grained sandboxing, capabilities-based security, modularity, and standards such as WebAssembly and WASI.

The Bytecode Alliance is a group with a specific mission, and we therefore will only sponsor projects that are in alignment with and further that mission. For example, project sponsorship is untenable if the project undermines sandboxing, security, or standardization efforts.

This sample supports these goals.

Code Review

Description

All projects must gate merging pull requests on code reviews that audit not only for style but also substance, such as whether security invariants are properly maintained by the new code.

It is recommended, but not required, that hosted projects maintain a CODEOWNERS file and automatically assign reviewers as well.

Code reviews have a demonstrable impact on the quality of source code by catching bugs early, determining the best possible implementation, and fostering trust within the community. Timely responses let contributors know that their work is valued and encourages further contribution.

We include a CODEOWNERS file and will respond to issues, PRs, and other user requests in line with the requirements stated here.

Code of Conduct

All Bytecode Alliance projects must:

  • link to the Bytecode Alliance's Code of Conduct documents from a CODE_OF_CONDUCT.md file in root of the repository, and
  • enforce the codes of conduct among the community and contributors, or escalate to the Bytecode Alliance CoC Team, if needed.

Having a code of conduct is crucial for creating a positive and respectful environment in any organization, community, or group. It serves as a set of guidelines that outline expected behavior and ethical standards for all members involved.

We have adopted the BA CoC.

Continuous Integration Testing

All projects must run continuous integration (CI) tests on all pull requests and merges. Key project features must be covered by CI.

If any part of the CI gates on merging changes that is not reproducible by external contributors, then the project must make affordances to support those external contributors.

Implementing CI offers several benefits to software projects, helping ensure correctness and quality, making it an essential practice for modern software development.

We have CI set up, though we intend to improve it further. Part of the reason why we're upstreaming this sample is to collaborate and improve the standard Component flows - and that includes testing too. The fact that this takes some effort to do correctly is exactly what we're hoping to improve.

Contributor Documentation

All projects must have a CONTRIBUTING.md document in the root of their repository. This document must provide, or link to another form of project-specific documentation that provides, high-quality contributor documentation.

See "How to build a CONTRIBUTING.md" by the Mozilla Science Lab for more details on what a high-quality CONTRIBUTING.md file looks like.

A CONTRIBUTING.md serves as a guide for potential contributors, outlining the expectations for individuals who wish to contribute to the project. The Bytecode Alliance is a community-driven software foundation and documents like CONTRIBUTING.md are necessary for fostering community contributions.

We include a CONTRIBUTING.md.

Following the Bytecode Alliance Operational Principles

All projects must follow the Bytecode Alliance Operational Principles.

In pursuing our mission and vision, the Bytecode Alliance follows a set of operational principles aimed at keeping us aligned on three key aspects: what we want to create, how we want to work together, and how we want to work with others.

We follow the operating principles. Our intent with this sample is to establish a first language-specific sample that can be replicated by other languages / interfaces too.

Licensing Compatible with the Bytecode Alliance

All projects must be licensed under the Apache 2.0 license with an LLVM exception. Exemptions may be granted by the board.

All projects must only use dependencies and third-party code licensed under one of the following open source licenses:

  • Apache-2.0 WITH LLVM-exception
  • Apache-2.0
  • BSD-2-Clause
  • BSD-3-Clause
  • ISC
  • MIT
  • MPL-2.0
  • OpenSSL
  • Unicode-DFS-2016
  • Zlib

All dependencies and third-party code must be properly attributed.

The source for all projects must be available to all members and must be available to all non-members under the same license.

All projects must automatically ensure that licensing requirements of dependencies are met in CI.

We strive to build an open community and a legally-compatible software ecosystem.

We've adopted the Apache-2.0 license with LLVM exception.
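
As a minimal sketch of what the automated license check described above could look like, the Rust snippet below tests dependency licenses against the allow-list quoted in this section. The function names and the dependency names in the usage note are hypothetical, and the check only handles exact SPDX identifiers (compound expressions such as "MIT OR Apache-2.0" are not parsed). In practice a dedicated auditing tool would likely do this job in CI.

```rust
/// SPDX identifiers copied from the allow-list quoted in this section.
const ALLOWED_LICENSES: &[&str] = &[
    "Apache-2.0 WITH LLVM-exception",
    "Apache-2.0",
    "BSD-2-Clause",
    "BSD-3-Clause",
    "ISC",
    "MIT",
    "MPL-2.0",
    "OpenSSL",
    "Unicode-DFS-2016",
    "Zlib",
];

/// Returns true if `license` exactly matches one allow-listed identifier.
/// Assumption: the input is a single SPDX identifier, not a compound expression.
pub fn is_allowed_license(license: &str) -> bool {
    let license = license.trim();
    ALLOWED_LICENSES.iter().any(|allowed| *allowed == license)
}

/// Hypothetical helper: given `(dependency name, license)` pairs, returns the
/// names of the dependencies whose license is not on the allow-list.
pub fn disallowed<'a>(deps: &[(&'a str, &'a str)]) -> Vec<&'a str> {
    deps.iter()
        .filter(|(_, license)| !is_allowed_license(license))
        .map(|(name, _)| *name)
        .collect()
}
```

For example, `disallowed(&[("dep-a", "MIT"), ("dep-b", "GPL-3.0-only")])` would report only `dep-b`, which a CI job could turn into a failing check.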

README

All hosted projects must have a README.md file in the root of the repository which begins with:

  • The project name and logo (if one exists)
  • A one-sentence description of the project
  • <strong>A <a href="https://bytecodealliance.org/">Bytecode Alliance</a> hosted project</strong>

The most important information about the project should be "above the fold". Projects should identify themselves as Bytecode Alliance projects so that, with time, people associate the Bytecode Alliance with quality projects that they can rely on.

We meet these requirements.

Release Process

Documentation of a release process that any project maintainer may execute to create a new release version of the software.

Multiple people must have permissions to publish releases. A GitHub team must have access to publish packages and package ownership on the associated package repository when possible. For example, a Rust project may have multiple owners on crates.io.

Projects and their releases shouldn't be tied to any single user's machine or keys to ensure continuity of the project. A project isn't an open, community project if only one person can publish releases.

Automation makes fewer mistakes than humans, and getting releases right is critical, since only releases are typically used downstream, not random commits from main.

Our sample includes automated releases - uploading components to registries is an important part of the core flow.

Security Process

All projects must have a documented security process for reporting and disclosing vulnerabilities, managing patches that fix vulnerabilities, and announcing and making available security releases. Furthermore, projects must actually follow their documented processes.

It is recommended that projects request Common Vulnerabilities and Exposures (CVE) numbers for discovered vulnerabilities and report the CVE when disclosing the vulnerability.

A tool like Dependabot may suffice for hosted projects. Dependabot should be used for security updates only, and not apply all updates indiscriminately. Updating dependencies should otherwise be done with intention (never automatically). Automatic creation of pull requests is acceptable, but manual review is required to prevent supply chain attacks.

Bytecode Alliance projects must be a secure foundation for others to build upon. Transparency and a managed security release process are key to being this foundation.

We have configured Dependabot.

Semantic Versioning

All projects must follow either standard semantic versioning or their ecosystem's local-dialect of semantic versioning (for example, Rust and cargo's interpretation of semantic versioning slightly differs from the standard, but is acceptable for Rust Bytecode Alliance projects).

A clear versioning scheme is necessary for end-users. We desire consistency across projects and so the Bytecode Alliance has adopted semantic versioning as a required best practice.

We follow semantic versioning.
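
To illustrate the Cargo dialect mentioned above, here is a rough Rust sketch of Cargo's default (caret) compatibility rule, where the leftmost non-zero version component decides what counts as a breaking change. The `Version` type and both function names are illustrative assumptions rather than part of the sample, and pre-release and build metadata are ignored.

```rust
/// A plain `major.minor.patch` version. Pre-release and build metadata are
/// deliberately left out of this sketch.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub struct Version {
    pub major: u64,
    pub minor: u64,
    pub patch: u64,
}

/// Parses exactly three dot-separated numeric components, e.g. "1.2.3".
pub fn parse(s: &str) -> Option<Version> {
    let mut parts = s.split('.').map(|p| p.parse::<u64>().ok());
    let version = Version {
        major: parts.next()??,
        minor: parts.next()??,
        patch: parts.next()??,
    };
    // Reject trailing components such as "1.2.3.4".
    if parts.next().is_some() {
        return None;
    }
    Some(version)
}

/// Returns true if moving from `from` to `to` counts as a compatible update
/// under Cargo's default caret rule:
///   1.2.3 accepts >=1.2.3, <2.0.0
///   0.2.3 accepts >=0.2.3, <0.3.0
///   0.0.3 accepts >=0.0.3, <0.0.4
pub fn cargo_compatible(from: Version, to: Version) -> bool {
    if to < from {
        return false; // downgrades are never a compatible update
    }
    if from.major != 0 {
        to.major == from.major
    } else if from.minor != 0 {
        to.major == 0 && to.minor == from.minor
    } else {
        to.major == 0 && to.minor == 0 && to.patch == from.patch
    }
}
```

Under this rule a bump from 0.2.3 to 0.3.0 is breaking, whereas standard semantic versioning makes no compatibility promises at all for 0.y.z releases.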

Secrets Management

GitHub organization and repository level secrets should be used. Secrets must not be hard coded in source.

For secrets like passwords for the project's associated social media account, these should be stored in the password service paid for by the Bytecode Alliance. Contact the TSC for access and ability to manage a given secret.

Secure secret management is a requirement for a secure project. Additionally, projects and their associated accounts shouldn't be tied to any single user's machine or keys to ensure continuity of the project. A project isn't an open, community project if only one person can access its accounts.

We do not manage any secrets nor do we plan to.

Supply Chain Security

All projects must follow a well-documented process for updating dependencies and auditing them for malicious supply-chain attacks.

When applicable, projects should:

  • Integrate auditing tools in CI (such as cargo vet)
  • Use code review and static analysis tools on dependencies

Finally, projects must document and follow their process for responding to upstream vulnerabilities in dependencies.

Our mission of developing runtime environments and language toolchains where security, efficiency, and modularity can all coexist necessarily means that we have performed our due diligence to mitigate software supply chain attacks.

We intend to show how to configure, store, and manage a software bill of materials (SBOM) - but that's not yet part of the MVP of the sample. We do intend to add this, with the purpose of educating Component authors on how to manage SBOMs themselves.

Sustainable Contributor Base

All projects must have regular contributions from multiple contributors.

It is recommended that hosted projects additionally have contributors affiliated with at least two different Bytecode Alliance organizations and that the project's leadership has representation from at least two different Bytecode Alliance organizations.

There must not be any private information necessary to fully contribute to the project.

A project is not considered healthy with only one contributor. An open, community project requires input from multiple stakeholders and does not rely on a single person.

The TSC may waive the above contributor base requirements under certain conditions. In particular, the TSC may decide to adopt crucial upstream dependencies of existing Bytecode Alliance projects that are otherwise effectively unmaintained or only have a single maintainer.

This sample has multiple contributors - albeit all from a single company (Microsoft). We expect more people will contribute to this once it's upstreamed, as for example it shows some of the limitations of cargo-component. If we do it right, we should be able to fix those issues and simultaneously update the sample. Though this sample exists in a separate repo, it is heavily tied to the other projects in the BA.

Version Control

All projects must be hosted on the Bytecode Alliance Organization on GitHub.

Access controls are managed via the Bytecode Alliance organization on GitHub. This allows for continuity of the project when hosted in one place. Finally, this is the only way to reasonably manage the projects within the organization.

Once this project is accepted, we will move the project over.

Recommendations

Changelog

It is recommended that hosted projects highlight key additions, breaking changes, security fixes, and otherwise noteworthy changes in a changelog.

See keepachangelog.com for a recommended approach.

We are building an ecosystem that developers can depend on, and one small part of that is communicating important changes downstream.

We currently don't keep a changelog - though that's something that would be neat to automate as part of the release process. We agree that keeping changelogs is a good thing, and at a minimum want to make it easy for users of the sample to keep one - even if it's just a list of pull requests.

Continuous Fuzzing

Not all projects will necessarily benefit from fuzzing, for example benchmark suites. The TSC may choose to lift this requirement for a particular project.

It is recommended that hosted projects have 24/7, round the clock, continuous fuzzing. The fuzzing should exercise significant amounts of the code base and test the project's most important properties, such as sandboxing. Bugs and vulnerabilities discovered via fuzzing should be addressed promptly.

As part of our open-source and open contribution model, the corpus and setup for running fuzzing should be open-sourced as part of the project.

Faults discovered via fuzzing must be reported privately to the project's core team so that the project's security vulnerability process can be followed properly, if necessary. For example, fuzzing infrastructure must not automatically open public issues for any fault that is discovered.

Continuous fuzzing is a valuable practice for projects, due to its significant benefits in improving security and reliability. Within the Bytecode Alliance, we host projects that provide a sandbox. The fidelity of these sandboxes must be battle-tested via a number of methodologies including automated fuzzing.

This is a sample showing how to use other tools. We wouldn't benefit much from directly fuzzing 24/7; instead, it seems better to transitively rely on the projects we are showcasing being thoroughly tested and fuzzed.

End-User Documentation

We abide by the OpenSSF requirements for documentation:

The documentation of an external interface explains to an end-user or developer how to use it. This would include its application program interface (API) if the software has one. If it is a library, document the major classes/types and methods/functions that can be called. If it is a web application, define its URL interface (often its REST interface). If it is a command-line interface, document the parameters and options it supports. In many cases it's best if most of this documentation is automatically generated, so that this documentation stays synchronized with the software as it changes, but this isn't required. The project MAY use hypertext links to non-project material as documentation. Documentation MAY be automatically generated (where practical this is often the best way to do so).

Furthermore, we identify a few different types of (sometimes overlapping) documentation:

  • API documentation: Documentation for each type, method, function, and module in a library.
  • Architectural overviews: High-level documentation about the architecture of the project and how it works from a 1000-foot view that helps end users take advantage of the project in the best way possible and helps onboard new contributors.
  • Examples: Code examples that show off how to use the project as a whole or particular features it supports.
  • Guides and tutorials: Long-form prose, with code samples interspersed, that shows how to accomplish a task using the project.

API and CLI flag documentation is required for hosted projects; all other types are recommended.

Documentation is necessary for end-users to productively use the project; source code comments are not sufficient.

This project itself is documentation in the form of both code and prose.

Issue Triage Process

Hosted projects must use an issue tracker for tracking individual issues.

It is recommended that hosted projects should additionally have a documented process for expeditiously triaging incoming issues and pull requests, and follow that process. Contributors should get prompt responses to their issues and pull requests, even if a response is not an immediate fix or review.

For a successful community-driven project, expedient communication within issues and PRs encourages further collaboration and contribution.

We have an issue tracker.

Leverage the Bytecode Alliance RFC Process

A request for comments (RFC) is a technique for soliciting the community and contributors for feedback on proposed major changes and decisions.

It is recommended that hosted projects follow the Bytecode Alliance RFC process for changes that significantly affect project stakeholders or contributors. The RFCs repo describes when an RFC is needed in more detail:

Many changes to Bytecode Alliance projects can and should happen through every-day GitHub processes: issues and pull requests. An RFC is warranted when:

  • The work involves changes that will significantly affect stakeholders or project contributors. Each project may provide more specific guidance. Examples include:
    • Major architectural changes
    • Major new features
    • Simple changes that have significant downstream impact
    • Changes that could affect guarantees or level of support, e.g. removing support for a target platform
    • Changes that could affect mission alignment, e.g. by changing properties of the security model
  • The work is substantial and you want to get early feedback on your approach.

This is a best practice for aligning contributors, the community, and downstream projects' needs with proposed technical implementations.

TODO: discussion of this recommendation and any supporting evidence (such as links to code, documentation, issues, and pull requests)

Production Use

It is recommended that hosted projects have demonstrated use in production by at least three independent organizations which are, in the TSC's judgement, of adequate quality and scope.

It is recommended that projects track production usage by organizations in an ADOPTERS.md at the root of the project, for example see ADOPTERS.md in Wasmtime.

Projects should demonstrate that they are practical, useful, and reliable enough to use in production.

The purpose of this sample is to increase the production use of existing projects.

Public Project Meetings and Notes

It is recommended that hosted projects hold regular and public project meetings. Meeting times and frequency should be advertised publicly, for example in the project's CONTRIBUTING.md. To avoid spam and "Zoom bombing", the video conferencing link need not be public, but should be available upon request.

Agendas for upcoming meetings and notes from past meetings should be published publicly. The notes should be in the bytecodealliance/meetings repository.

Public meetings encourage open communication, collaboration, and engagement within the project's community. Notes allow community members who were not present to remain aligned and can document any decisions made during the meeting.

Samples probably don't need their own individual project groups, as they intend to reflect the usage and best practices of other, existing tools. Those tools have their own meetings and logs, and we expect most of the conversations and decisions to be made in those groups - and only once completed will those changes be represented in the sample.

Sanitizers and Code Analysis

Static and dynamic code analysis tools (such as valgrind or miri) where applicable are recommended to be used by hosted projects.

It is recommended that hosted projects with non-trivial amounts of unsafe code (e.g. unsafe in Rust or any C/C++) run tests and fuzzers with the relevant sanitizers: Address Sanitizer, Memory Sanitizer, Thread Sanitizer, etc.

Automated code analysis is key to meeting our mission of developing runtime environments and language toolchains where security, efficiency, and modularity can all coexist.

We do not use unsafe code, but we do apply various other static analysis tools such as rustfmt and clippy.

@yoshuawuyts yoshuawuyts changed the title Hosted Project Proposal: rust-wasm-sample Hosted Project Proposal: rust-wasi-sample Nov 20, 2024
@pchickey

Since this is wasi-http specific could we rename it to rust-wasi-http-sample?

@yoshuawuyts
Member Author

Including the name of the world makes sense to me. I mainly want to make sure we also set ourselves up to host samples for Go, C#, JS, etc. If we generalize that to a scheme, it sounds like that might become something like: {language}-{world}-sample.
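
A rough sketch of how that naming scheme could be applied consistently, written in Rust. The function names and the slug rule (lowercase ASCII, with any other run of characters collapsed to a single dash) are assumptions made for illustration only; a language such as C# would need an explicit mapping rather than this rule.

```rust
/// Lowercases ASCII letters and digits and collapses every other run of
/// characters into a single dash. This slug rule is an assumption of the sketch.
fn slug(s: &str) -> String {
    let mut out = String::new();
    for c in s.chars() {
        if c.is_ascii_alphanumeric() {
            out.push(c.to_ascii_lowercase());
        } else if !out.is_empty() && !out.ends_with('-') {
            out.push('-');
        }
    }
    out.trim_end_matches('-').to_string()
}

/// Builds a repository name following the `{language}-{world}-sample` scheme.
pub fn sample_repo_name(language: &str, world: &str) -> String {
    format!("{}-{}-sample", slug(language), slug(world))
}
```

With these assumptions, `sample_repo_name("Rust", "wasi:http")` yields `rust-wasi-http-sample`, matching the name suggested earlier in the thread.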

@pchickey

Do we need a separate project / repo for each of those? Or can we maintain a single samples repo?

@yoshuawuyts
Member Author

yoshuawuyts commented Nov 20, 2024

This sample is set up as a GitHub Template: that makes it a single click to start modifying the sample to build your own. I think that's incredibly valuable, and we can't do that well if we put all samples in a monorepo. I think it also makes the sample feel more targeted/realistic if it's language-specific.

@pchickey

Ah, ok, I didn't understand that Template was limited in that way, which is sorta a bummer but I guess it makes sense.

@yoshuawuyts
Member Author

yoshuawuyts commented Nov 20, 2024

By the way, I'm not sure if this was clear from the description, but I hope that on the hosting side individual host projects will end up creating their own samples to run these applications. E.g. I think it'd be great if there are dedicated samples for {spin, wasmcloud} on {AWS, Azure, Gcloud} and so on. If we can link to these, the getting-started flow can just become:

  1. pick your language
  2. pick your hosting platform
  3. clone both templates
  4. you're now off to the races :)

With colleagues at Azure we're currently also working on an initial sample for running Wasm HTTP Components on AKS, which should provide an initial end-to-end flow people can use.

@tschneidereit
Member

Thank you for the very kind offer to contribute this project to the BA! ❤️ I (for now personally, not speaking on behalf of the TSC) think this would be a great addition.

Organization-wise, I agree with Pat that perhaps this doesn't need to be its own project. The key bit is that BA projects don't have a 1:1 mapping to repositories: a single project can span multiple repos, and a single repo can contain multiple projects.

Given this, would you perhaps be up for generalizing the proposed project definition here a bit to make it describe an umbrella project? A potential idea could also be to bring this up with SIG-Documentation to see if there's interest in helping maintain sample projects.

@tschneidereit
Member

We discussed this in the TSC call this week, and have strong agreement that we'd love to have this hosted in the BA.

The details are a bit more open: for one, we agreed that it'd make sense to have this sort of example be part of a larger effort that'd form a single hosted project, instead of going through the application process for each of them. We also agreed that it might indeed make sense to coordinate this with SIG-Documentation, which we could see as a good forum for collaborating on examples in general.

The other, more involved part is that thus far, the BA has been careful about not taking stances on how to use any of the tools and approaches we develop in the context of non-BA tools or platforms. What you're proposing would change that, and is something we'll need to think through a bit more.

Would it be okay for now to start coordinating with SIG-Documentation on how to structure examples like this, while focusing on wasmtime based scenarios instead of ones involving external platforms or tools?

@yoshuawuyts
Member Author

yoshuawuyts commented Dec 1, 2024

@tschneidereit thanks for your response! Happy to further generalize the proposal text and happy to talk to SIG-Documentation.

To clarify my earlier comment: I do not expect the BA to maintain any runtime-specific samples. Something like "Spin on Azure Kubernetes Service" would best be maintained by either Azure or Fermyon, not the BA.

What I want the BA to host is canonical guest samples for various programming languages. And potentially also host samples for BA-hosted runtimes. However since WASI is still early in its adoption cycle, I do believe there is a lot of value in creating visibility for samples. And so I think it would be a good idea if we could make sure we are linking to the various samples, even if they're not hosted by the BA.

@tschneidereit
Member

> What I want the BA to host is canonical guest samples for various programming languages. And potentially also host samples for BA-hosted runtimes. However since WASI is still early in its adoption cycle, I do believe there is a lot of value in creating visibility for samples. And so I think it would be a good idea if we could make sure we are linking to the various samples, even if they're not hosted by the BA.

That all sounds absolutely perfect to me, and is very well aligned with how we talked about it in the last TSC meeting.

> To clarify my earlier comment: I do not expect the BA to maintain any runtime-specific samples. Something like "Spin on Azure Kubernetes Service" would best be maintained by either Azure or Fermyon, not the BA.

Thank you for the clarification, that helps!

It seems like we have sorted what we need to be sorted here for now, and the next step is coordination with SIG-Documentation?

@yoshuawuyts
Member Author

> It seems like we have sorted what we need to be sorted here for now, and the next step is coordination with SIG-Documentation?

Great, I'm glad to hear that! I've reached out to them on Zulip in the #SIG Documentation channel.

@yoshuawuyts yoshuawuyts changed the title Hosted Project Proposal: rust-wasi-sample Hosted Project Proposal: language-specific wasi:http samples Jan 20, 2025
@yoshuawuyts
Member Author

yoshuawuyts commented Jan 21, 2025

Based on feedback from the TSC: changed the wording in the proposal to cover more wasi:http samples than just the Rust sample, but starting with the Rust sample. I also communicated with SIG-Docs, and this work is now being tracked in bytecodealliance/component-docs#182.

I think that covers all outstanding comments.

3 participants