From 889685274438ca20344d4d9cb472e4392c4e35a9 Mon Sep 17 00:00:00 2001
From: Sean Baxter
Date: Fri, 8 Nov 2024 13:36:53 -0500
Subject: [PATCH] Fixed some typos in draft.html

---
docs/draft.html   | 24 ++++++++++++------------
proposal/draft.md | 16 ++++++++--------
2 files changed, 20 insertions(+), 20 deletions(-)

diff --git a/docs/draft.html b/docs/draft.html
index f8b6b6b..ff77e28 100644
--- a/docs/draft.html
+++ b/docs/draft.html
@@ -926,7 +926,7 @@

1.5 or a linear type system. Unless explicitly initialized, objects start out uninitialized. They can’t be used in this state. When you assign to an object, it becomes initialized. When you relocate from
-an object, it’s value is moved and it’s reset to uninitialized. If you
+an object, its value is moved and it’s reset to uninitialized. If you
relocate from an object inside control flow, it becomes potentially uninitialized, and its destructor is conditionally executed after reading a compiler-generated drop flag.

@@ -1341,9 +1341,9 @@

establishes rules for where library code must insert panic calls. If a function is marked safe but is internally unsound for some values of its arguments, it should check those arguments and panic before executing
-the unsafe operation. Unsafe functions generally don’t panic because its
-the responsibility of their callers to observe the preconditions of the
-function.

+the unsafe operation. Unsafe functions generally don’t panic because
+it’s the responsibility of their callers to observe the preconditions of
+the function.

2 Design overview

2.1 The safe context

@@ -1892,7 +1892,7 @@

std::string_view was added to C++17 as a safer alternative to passing character pointers around. Unfortunately, its rvalue-reference constructor is so
-dangerously designed that its reported to encourage
+dangerously designed that it’s reported to encourage
use-after-free bugs.[string-view-use-after-free]

string_view0.cxx(Compiler Explorer)

@@ -2560,7 +2560,7 @@

ref, we see it’s live
-at points ’R0 = { 4, 8, 9, 10, 11 }. These are the points where its
+at points ’R0 = { 4, 8, 9, 10, 11 }. These are the points where it’s
subsequently used. We’ll grow the loan regions ’R1 and ’R2 until their constraint equations are satisfied.

'R1 : 'R0 @ P3
@@ -2642,8 +2642,8 @@

+

2.2.7 Lifetime constraints on +called functions

Borrow checking is easiest to understand when applied to a single function. The function is lowered to a control flow graph, the compiler assigns regions to loans and borrow variables, emits lifetime
@@ -4343,8 +4343,8 @@

const or mut, also determines exclusivity. But the cast-away-const model of interior mutability is an awkward
@@ -4485,7 +4485,7 @@

2.7 T is send. Since most types are send by construction, we can safely
-mutate shared state over multiple threads as long as its wrapped in a
+mutate shared state over multiple threads as long as it’s wrapped in a
std2::mutex and that’s owned by an std2::arc.
@@ -4798,7 +4798,7 @@

-SIL,[sil] Rust passes through MIR,[mir] and Circle passes through it’s
+SIL,[sil] Rust passes through MIR,[mir] and Circle passes through its
mid-level IR when targeting the new object model. There is an effort called ClangIR[clangir] to lower Clang to an MLIR dialect called CIR, but the project is in an early phase and doesn’t
diff --git a/proposal/draft.md b/proposal/draft.md
index 25c7427..6be7b90 100644
--- a/proposal/draft.md
+++ b/proposal/draft.md
@@ -210,7 +210,7 @@ The "billion-dollar mistake" is a type safety problem. Consider `std::unique_ptr

As Hoare observes, the problem comes from conflating two different things, a pointer to an object and an empty state, into the same type and giving them the same interface. Smart pointers should only hold valid pointers. Denying the null state eliminates undefined behavior.

-We address the type safety problem by overhauling the object model. Safe C++ features a new kind of move: [_relocation_](#relocation-object-model), also called _destructive move_. The object model is called an _affine_ or a _linear_ type system. Unless explicitly initialized, objects start out _uninitialized_. They can't be used in this state. When you assign to an object, it becomes initialized. When you relocate from an object, it's value is moved and it's reset to uninitialized. If you relocate from an object inside control flow, it becomes _potentially uninitialized_, and its destructor is conditionally executed after reading a compiler-generated drop flag.
+We address the type safety problem by overhauling the object model. Safe C++ features a new kind of move: [_relocation_](#relocation-object-model), also called _destructive move_. The object model is called an _affine_ or a _linear_ type system. Unless explicitly initialized, objects start out _uninitialized_. They can't be used in this state. When you assign to an object, it becomes initialized. When you relocate from an object, its value is moved and it's reset to uninitialized. If you relocate from an object inside control flow, it becomes _potentially uninitialized_, and its destructor is conditionally executed after reading a compiler-generated drop flag.

`std2::box` is our version of `unique_ptr`. It has no null state. There's no default constructor. Dereference it without risk of undefined behavior. If this design is so much safer, why doesn't C++ simply introduce its own fixed `unique_ptr` without a null state? Blame C++11 move semantics.

@@ -530,7 +530,7 @@ public:
};
```

-The [safety model](#memory-safety-as-terms-and-conditions) establishes rules for where library code must insert panic calls. If a function is marked safe but is internally unsound for some values of its arguments, it should check those arguments and panic before executing the unsafe operation. Unsafe functions generally don't panic because its the responsibility of their callers to observe the preconditions of the function.
+The [safety model](#memory-safety-as-terms-and-conditions) establishes rules for where library code must insert panic calls. If a function is marked safe but is internally unsound for some values of its arguments, it should check those arguments and panic before executing the unsafe operation. Unsafe functions generally don't panic because it's the responsibility of their callers to observe the preconditions of the function.

# Design overview

@@ -925,7 +925,7 @@ Garbage collection requires storing objects on the _heap_. But C++ is about _man

### Use-after-free

-`std::string_view` was added to C++17 as a safer alternative to passing character pointers around. Unfortunately, its rvalue-reference constructor is so dangerously designed that its reported to _encourage_ use-after-free bugs.[@string-view-use-after-free]
+`std::string_view` was added to C++17 as a safer alternative to passing character pointers around. Unfortunately, its rvalue-reference constructor is so dangerously designed that it's reported to _encourage_ use-after-free bugs.[@string-view-use-after-free]

[**string_view0.cxx**](https://github.com/cppalliance/safe-cpp/blob/master/proposal/string_view0.cxx) -- [(Compiler Explorer)](https://godbolt.org/z/e3TG6W5Me)
```cpp
@@ -1386,7 +1386,7 @@ P11: f(*ref);
}
```

-I've relabelled the example to show function points and region names of variables and loans. If we run live analysis on 'R0, the region for the variable `ref`, we see it's live at points 'R0 = { 4, 8, 9, 10, 11 }. These are the points where its subsequently used. We'll grow the loan regions 'R1 and 'R2 until their constraint equations are satisfied.
+I've relabelled the example to show function points and region names of variables and loans. If we run live analysis on 'R0, the region for the variable `ref`, we see it's live at points 'R0 = { 4, 8, 9, 10, 11 }. These are the points where it's subsequently used. We'll grow the loan regions 'R1 and 'R2 until their constraint equations are satisfied.

`'R1 : 'R0 @ P3` means that starting at P3, the 'R1 contains all points 'R0 does, along all control flow paths, as long as 'R0 is live. 'R1 = { 3, 4 }. Grow 'R2 the same way: 'R2 = { 7, 8, 9, 10, 11 }.

@@ -1440,7 +1440,7 @@ Circle tries to identify all three of these points when forming borrow checker e

The invariants that are tested are established with a network of lifetime constraints. It might not be the case that the invalidating action is obviously related to either the place of the loan or the use that extends the loan. More completely describing the chain of constraints could help users diagnose borrow checker errors. But there's a fine line between presenting an error like the one above, which is already pretty wordy, and overwhelming programmers with information.

-### Lifetime constraints on called functinos
+### Lifetime constraints on called functions

Borrow checking is easiest to understand when applied to a single function. The function is lowered to a control flow graph, the compiler assigns regions to loans and borrow variables, emits lifetime constraints where there are assignments, iteratively grows regions until the constraints are solved, and walks the instructions, checking for invalidating actions on loans in scope. Within the definition of the function, there's nothing it can't analyze. The complexity arises when passing and receiving borrows through function calls.

@@ -2555,7 +2555,7 @@ Lifetime safety also guarantees that the `lock_guard` is in scope (meaning the m

Interior mutability is a legal loophole around exclusivity. You're still limited to one mutable borrow or any number of shared borrows to an object. Types with a deconfliction strategy use `unsafe_cell` to safely strip the const off shared borrows, allowing users to mutate the protected resource.

-Safe C++ and Rust conflate exclusive access with mutable borrows and shared access with const borrows. It's is an economical choice, because one type qualifier, `const` or `mut`, also determines exclusivity. But the cast-away-const model of interior mutability is an awkward consequence. This design may not be the only way: The Ante language[@ante] experiments with separate `own mut` and `shared mut` qualifiers. That's really attractive, because you're never mutating something through a const reference. This three-state system doesn't map onto C++'s existing type system as easily, but that doesn't mean the const/mutable borrow treatment, which does integrate elegantly, is the most expressive. A `shared` type qualifier merits investigation during the course of this project.
+Safe C++ and Rust conflate exclusive access with mutable borrows and shared access with const borrows. It's an economical choice, because one type qualifier, `const` or `mut`, also determines exclusivity. But the cast-away-const model of interior mutability is an awkward consequence. This design may not be the only way: The Ante language[@ante] experiments with separate `own mut` and `shared mut` qualifiers. That's really attractive, because you're never mutating something through a const reference. This three-state system doesn't map onto C++'s existing type system as easily, but that doesn't mean the const/mutable borrow treatment, which does integrate elegantly, is the most expressive. A `shared` type qualifier merits investigation during the course of this project.

* `T^` - Exclusive mutable access. Permits standard conversion to `shared T^` and `const T^`.
* `shared T^` - Shared mutable access. Permits standard conversion to `const T^`. Only types that enforce interior mutability have overloads with shared mutable access.
@@ -2610,7 +2610,7 @@ class [[

`std2::mutex` is another candidate for use with `std2::arc`. This type is thread safe. As shown in the [thread safety](#thread-safety) example, it provides threads with exclusive access to its interior data using a synchronization object. The borrow checker prevents the reference to the inner data from being used outside of the mutex's lock. Therefore, `std2::mutex` is `sync` if its inner type is `send`. Why make it conditional on `send` when the mutex is already providing threads with exclusive access to the inner value? This provides protection for the rare type with thread affinity. A type is `send` if it can both be copied to a different thread _and used_ by a different thread.

-`std2::arc<std2::mutex<T>>` is `send` if `std2::mutex` is `send` and `sync`. `std2::mutex` is `send` and `sync` if `T` is `send`. Since most types are `send` by construction, we can safely mutate shared state over multiple threads as long as its wrapped in a `std2::mutex` and that's owned by an `std2::arc`. The `arc` provides shared ownership. The `mutex` provides shared mutation.
+`std2::arc<std2::mutex<T>>` is `send` if `std2::mutex` is `send` and `sync`. `std2::mutex` is `send` and `sync` if `T` is `send`. Since most types are `send` by construction, we can safely mutate shared state over multiple threads as long as it's wrapped in a `std2::mutex` and that's owned by an `std2::arc`. The `arc` provides shared ownership. The `mutex` provides shared mutation.

```cpp
class thread {
@@ -2773,7 +2773,7 @@ We should also revise the policy for using lifetime parameters in class definiti

# Implementation guidance

-The intelligence behind the _ownership and borrowing_ safety model resides in the compiler's middle-end, in its _MIR analysis_ passes. The first thing compiler engineers should focus on when pursuing memory safety is to lower their frontend's AST to MIR. Several compiled languages already pass through a mid-level IR: Swift passes through SIL,[@sil] Rust passes through MIR,[@mir] and Circle passes through it's mid-level IR when targeting the new object model. There is an effort called ClangIR[@clangir] to lower Clang to an MLIR dialect called CIR, but the project is in an early phase and doesn't have enough coverage to support the language or library features described in this document.
+The intelligence behind the _ownership and borrowing_ safety model resides in the compiler's middle-end, in its _MIR analysis_ passes. The first thing compiler engineers should focus on when pursuing memory safety is to lower their frontend's AST to MIR. Several compiled languages already pass through a mid-level IR: Swift passes through SIL,[@sil] Rust passes through MIR,[@mir] and Circle passes through its mid-level IR when targeting the new object model. There is an effort called ClangIR[@clangir] to lower Clang to an MLIR dialect called CIR, but the project is in an early phase and doesn't have enough coverage to support the language or library features described in this document.

The AST->MIR and MIR->LLVM pipelines (or whatever codegen is used) fully replaces the compiler's old AST->LLVM codegen. It is more difficult to lower through MIR than directly emitting LLVM, but implementing new codegen is not a very large investment. You can look into Circle's MIR support with the `-print-mir` and `-print-mir-drop` cmdline options, which print the MIR before and after drop elaboration, respectively.