Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bitfield Domain #1623

Open
wants to merge 172 commits into
base: master
Choose a base branch
from
Open

Bitfield Domain #1623

wants to merge 172 commits into from

Conversation

ManuelLerchner
Copy link

@ManuelLerchner ManuelLerchner commented Nov 5, 2024

This pull request enhances the IntDomain by introducing a new BitfieldDomain capable of tracking bitwise operations.
Additionally, it adds all newly possible refinement directions to improve precision in the IntDomain.

This implementation is part of the Automated Bug Hunting and Beyond practical course at TUM.

Copy link

@github-advanced-security github-advanced-security bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Semgrep OSS found more than 20 potential problems in the proposed changes. Check the Files changed tab for more details.

@ManuelLerchner ManuelLerchner marked this pull request as ready for review January 20, 2025 17:32
Comment on lines +401 to +409
let a', b' = ID.meet a (ID.logand a c), ID.meet b (ID.logand b c) in
let (cz, co) = ID.to_bitfield ikind c in
let (az, ao) = ID.to_bitfield ikind a' in
let (bz, bo) = ID.to_bitfield ikind b' in
let cDef1 = Z.logand co (Z.lognot cz) in
let aDef0 = Z.logand az (Z.lognot ao) in
let bDef0 = Z.logand bz (Z.lognot bo) in
let az = Z.logand az (Z.lognot (Z.logand bDef0 cDef1)) in
let bz = Z.logand bz (Z.lognot (Z.logand aDef0 cDef1)) in
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should most of these computations not be moved inside your domain? (After getting the bitfields).

This works on the internal representation of these bitfield things, so maybe should be in there. Granted, the intervals also leak outside of their domain,but people intuitively understand how they work.

Also, please leave some comment here about what happens. One can puzzle it together, but it took @DrMichaelPetter and me about 10 mins yesterday to do so.

Comment on lines +432 to +433
| None -> a)
| _ -> a
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The tracing output should still be produced if bitfield is off.

Comment on lines +415 to +416
let a' = ID.meet a (ID.logxor c b)
in let b' = ID.meet b (ID.logxor a c)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please change to use standard indentation.

let ... = ... in

Comment on lines +435 to +446
if PrecisionUtil.get_bitfield () then
let a', b' = ID.meet a (ID.logor a c), ID.meet b (ID.logor b c) in
let (cz, co) = ID.to_bitfield ikind c in
let (az, ao) = ID.to_bitfield ikind a' in
let (bz, bo) = ID.to_bitfield ikind b' in
let cDef0 = Z.logand cz (Z.lognot co) in
let aDef1 = Z.logand ao (Z.lognot az) in
let bDef1 = Z.logand bo (Z.lognot bz) in
let ao = Z.logand ao (Z.lognot (Z.logand bDef1 cDef0)) in
let bo = Z.logand bo (Z.lognot (Z.logand aDef1 cDef0)) in
ID.meet a' (ID.of_bitfield ikind (az, ao)), ID.meet b' (ID.of_bitfield ikind (bz, bo))
else a, b
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as above.

Comment on lines +144 to +145
match x with None -> (Z.zero, Z.zero) | Some (c,m) ->
if m = Z.zero then (Z.lognot c, c)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please indent as done in other places in Goblint.

Comment on lines +146 to +151
else if is_power_of_two m then
let mod_mask = m -: Z.one in
let z = Z.lognot c in
let o = Z.logor (Z.lognot mod_mask) c in
(z,o)
else (Z.lognot Z.zero, Z.lognot Z.zero)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does one need to consider the signedness here?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm unsure whether you mean the signedess of c and m or of the returned bitfield. But in both cases, we do not have to consider it explicitly. normalize always returns positive values for c and m if m is unequal to zero. Additionally, a possible sign bit is set to top in the returned bitfield representation.

Comment on lines 503 to 505
let refine_with_bitfield ik a (z,o) =
if Z.lognot z = o then meet ik a (Some (o, Z.zero))
else a
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this the only case where we can benefit? Naively, one thinks we should be able to benefit along the reasoning of the to_bitfield above?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We benefit along the reasoning of the to_bitfield in the refine_with_congruence of the bitfield domain because we transform a value from the congruence into the bitfield domain. Unfortunately, I don't see how we can use similar tricks for the converse direction.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Naively I would think that knowing that the last k bits are must-zero would give rise to a
congruence of the type z mod 2^(k) = 0, at least in the case where we also know that the values are positive (because of the ik, or because the MSB is not set)?
But maybe that's wrong?

Comment on lines +302 to +306
let to_bitfield ik x =
match x with
`Definite c -> (Z.lognot c, c) |
_ -> let one_mask = Z.lognot Z.zero
in (one_mask, one_mask)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indentation

Comment on lines +539 to +541
let refine_with_bitfield ik x (z,o) =
if Z.lognot z = o then meet ik x (`Definite o)
else x
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you not need to consider the ik here?

Comment on lines +253 to +262
let to_bitfield ik x =
match x with
Inc i when BISet.is_empty i -> (Z.zero, Z.zero) |
Inc i when BISet.is_singleton i ->
let o = BISet.choose i
in (Z.lognot o, o) |
Inc i -> BISet.fold (fun o (az, ao) -> (Z.logor (Z.lognot o) az, Z.logor o ao)) i (Z.zero, Z.zero) |
_ -> let one_mask = Z.lognot Z.zero
in (one_mask, one_mask)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With this indentation it is near impossible to review!

@@ -1,4 +1,5 @@
open IntDomain0
open GoblintCil
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where are you using this? If it's only very few places, one may not want to open it.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We use it for the isSigned function of ikinds.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, since it's only used three times (and only in one function), I think i would personally prefer getting rid of the open and qualifying those references with GoblintCil.isSigned or using a local open in that function.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

Comment on lines +256 to +259
let rec from_list is acc = match is with
[] -> acc |
j::js -> from_list js (joinbf acc (Interval.to_bitfield ik (Some j)))
in from_list x (Ints_t.zero, Ints_t.zero)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this not just a fold?

Comment on lines +133 to +138
`top
else
if Ints_t.equal (Ints_t.logand (Ints_t.shift_right startv pos) Ints_t.one) Ints_t.one then
`one
else
`zero
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

where do these polymorphic variants now come from?

Comment on lines +60 to +61
annotation_int_enabled := None;
bitfield := None
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please swap

@@ -380,7 +380,7 @@ struct
let test_domain (module D: Lattice.S): unit =
let module DP = DomainProperties.All (D) in
Logs.debug "domain testing...: %s" (D.name ());
let errcode = QCheck_base_runner.run_tests DP.tests in
let errcode = QCheck_base_runner.run_tests DP.tests ~verbose:true in
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why verbose?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm curious if you've actually used this and found anything with this. Domain tests for individual integer domains are part of unit tests where they are directly triggered. This attempts to test the full domain used by the configured analysis, but I suspect this doesn't have arbitrary-s defined all the way down to the integer domains through all the functors. For that reason I've even considered removing this feature altogether, but if it has actually been useful then let me know.

@@ -0,0 +1,29 @@
// PARAM: --enable ana.int.bitfield
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In general it seems very few of your pograms tests actually test the join of your domain. That would be something that probably should be added.

Comment on lines +104 to +105
| (_, _, _, _, _, Some true)
-> true
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please stick with the normal indentation.

Comment on lines +106 to +119
let rec concretize (z,o) = (* O(2^n) *)
if is_const (z,o) then [o]
else
let bit = o &: Ints_t.one in
let bf_bit = nth_bf_bit (z,o) 0 in
concretize (z >>. 1, o >>: 1) |>
if bf_bit = `Undetermined then
List.concat_map (fun c -> [c <<: 1; (c <<: 1) |: Ints_t.one])
else if bf_bit = `Invalid then
failwith "Should not have happened: Invalid bit during concretization of a bitfield."
else
List.map (fun c -> c <<: 1 |: bit)

let concretize bf = List.map Ints_t.to_int (concretize bf)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where is this used?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The 'concretize' method is used inside the shift operations (shift_left, shift_right) as it returns the values which needs to be shifted.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fully concretizing an abstract value is not something that should be done lightly, as you have noted yourself. If you do a shift and know nothing about the operand (which can happen), this operation alone will completely kill the chance of ever terminating, already from constructing this list of length 2^64.

Does your domain not have an efficient way of computing $$| \gamma (x) |$$ by counting the number n of zero bits in z xor x (size 2^n)? In this case, you should only do something for small n (maybe up to 3) to stay tractable.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I see you actually bound the argument in all calls with the bitwidth, so it is not too problematic. Probably you should make a comment here as well then that the function is not a reasonable shift function for any arguments, but only for second arguments with small(!) concretizations.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure that makes sense, I will add comments that make this more obvious.

let bot () = (BArith.zero_mask, BArith.zero_mask)

let top_of ik =
if isSigned ik then top ()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think i understand how you handle ikinds in general here.

  • Do you norm the upper bits?
  • If not do you not risk having values that are semantically equal but syntactically not?
  • What happens for ikinds that are wider than the BArith.one_mask?

Why can this not work for signed types? And does this not mean you find out nothing as soon as any computations happen in the program (even for unsigned types)?

C99 6.3.1.

If an int can represent all values of the original type, the value is converted to an int; otherwise, it is converted to an unsigned int. These are called the integer
promotions. All other types are unchanged by the integer promotions.

Comment on lines +181 to +182
let o_mask = ref o in
let z_mask = ref z in
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why ref?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It was not necessary to use references in this case, therefore the code was changed and the ref removed.

let range ik bf = (BArith.min ik bf, BArith.max ik bf)

let maximal (z,o) =
if (z < Ints_t.zero) <> (o < Ints_t.zero) then Some o
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you explain this?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We basically check whether the sign bit is clear, i.e., whether we only represent positive or negative numbers. Under such a condition, the maximal represented number is just the number with the maximal number of ones.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, that works if the ikind and the type of Ints_t coincide. But what if you have, e.g., the ikind of signed char? There you would need to check the 7th bit right?

I wasn't there for the first couple of meetings, so maybe you'll need to enlighten us tomorrow how you deal with different ikinds in your domain in general 😃

Copy link
Member

@michael-schwarz michael-schwarz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR! I left some first comments in some places, but have not worked through the entire thing.

Copy link
Member

@jerhard jerhard left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the PR! I left some comments below.

Comment on lines +104 to +107
let to_bitfield ik z =
match z with None -> (Ints_t.lognot Ints_t.zero, Ints_t.lognot Ints_t.zero) | Some (x,y) ->
let (min_ik, max_ik) = Size.range ik in
let startv = Ints_t.max x (Ints_t.of_bigint min_ik) in
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you make this function easier to read?
A comment about what is done here overall may help. Possibly one may also find a more telling name for the helper function analyze_bits?

let isNegative = Ints_t.logand signBit o <> Ints_t.zero in
if isSigned ik && isNegative then Ints_t.logor signMask (Ints_t.lognot z)
else Ints_t.lognot z
in let max ik (z,o) =
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indentation.

Comment on lines +241 to +242
let signBit = Ints_t.shift_left Ints_t.one ((Size.bit ik) - 1) in
let signMask = Ints_t.lognot (Ints_t.of_bigint (snd (Size.range ik))) in
Copy link
Member

@jerhard jerhard Jan 23, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

signBit and signMask only seem to depend on ik. Can one extract them from min and max, so they don't have to be defined twice?

assert_shift_right ~rev_cond:true ik (`I [max_of ik]) top top; (* the sign bit shouldn't be set with right shifts if its unset *)

assert_shift_right ik (`I [2]) (`I [-1]) top; (* Negative shifts are undefined behavior *)
(*assert_shift_right ik (`I [min_of ik]) top top;*) (*TODO implementation-defined sign-bit handling *)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is this todo about?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test case could be removed if you agree with our design choice to handle right shifts as arithmetic ones.
Given the C standard's definition of right shifts on signed integers, it might be better to leave the upper bits undetermined. In that case this test case should pass.

Comment on lines +87 to +90
| true, true -> `Undetermined
| false, false -> `Invalid
| true, false -> `Zero
| false, true -> `One
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a specific reason you are using polymorphic variants? If not, it would be preferable to define a variant type. See also here.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, they were naively chosen to be used for readability purposes here but not necessary in any way. This part will be removed.

else None

let minimal (z,o) =
if (z < Ints_t.zero) <> (o < Ints_t.zero) then Some (!:z)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using the polymorphic compare < here probably works, but the compare on Ints_t is probably still the better choice?

Comment on lines +253 to +254
(* (M.warn ~category:M.Category.Integer.overflow "Bitfield: Value was outside of range, indicating overflow, but 'sem.int.signed_overflow' is 'assume_none' -> Returned Top"; *)
(* (bot (), overflow_info)) *)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Commented out code should be removed in the final version.


(* Conversions *)

let of_interval ?(suppress_ovwarn=false) ik (x,y) =
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this function share code with the to_bitfield function in the interval domain? If so, could this be deduplicated?

Comment on lines +300 to +306
if Ints_t.compare bigger_number endv <= 0 then
`top
else
if Ints_t.equal (Ints_t.logand (Ints_t.shift_right startv pos) Ints_t.one) Ints_t.one then
`one
else
`zero
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here, one could also define a variant type, instead of using a polymorphic variant type.

Comment on lines +599 to +601
(* Unit Tests *)

let arbitrary ik =
Copy link
Member

@jerhard jerhard Jan 23, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

arbitrary is used for unit tests, indeed; still the comment may be a bit misleading, since there are no unit tests following the comment.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants