forked from jchambers/id-obfuscator
-
Notifications
You must be signed in to change notification settings - Fork 0
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
- Loading branch information
Showing
2 changed files
with
134 additions
and
0 deletions.
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,7 @@ | ||
Copyright 2017 Jon Chambers | ||
|
||
Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions: | ||
|
||
The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software. | ||
|
||
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,127 @@ | ||
# ID Obfuscator | ||
|
||
ID Obfuscator is a Java library for obfuscating numerical identifiers. Practically, that means taking a number like "17" and transforming it into a string like "GDJSHCX" (and, presumably, turning that string back into a number later). | ||
|
||
## The sales pitch | ||
|
||
In the most common use case, you might have an application where you assign IDs to new users using a counter controlled by a central database. In this scheme, your first user would have an ID of 1, the second user would have an ID of 2, and so on. Ultimately, you might need to expose those identifiers to the public; for example, you might expose user profile pages with URLs like `https://example.com/user/1234`. This raises a few issues, though: | ||
|
||
- It's easy for malicious users to find out exactly how many users you have | ||
- It's easy for malicious users to crawl all your users' profile pages because the URLs are sequential | ||
- A user's URL gives hints as to how long they've been a user, which may be undesirable in some cases | ||
|
||
If we could use URLs that look more like `https://example.com/user/KJVTUYB`, those problems would go away. How can we get there, though? We could consider randomly-generating IDs, but then we'd have to deal with collisions (there's no guarantee we wouldn't randomly generate the same ID twice) and storing a mapping of random IDs to ordinal IDs. We could consider running the ordinal ID through a hash function, but ultimately this isn't too different from randomly generating IDs—it's still a one-way process and doesn't guarantee uniqueness. | ||
|
||
What we'd really like is a system with some key properties: | ||
|
||
- Obfuscated IDs are uniquely mapped to ordinal IDs | ||
- No need to look up obfuscated IDs in a database | ||
- Ordinal/obfuscated ID translation is fast | ||
|
||
This is (surprise!) exactly what ID Obfuscator offers; it allows you to apply fast, reversible transformations to numerical IDs so it's easy for you to get the original ID, but hard for a malicious user to do the same. This allows you to obfuscate IDs without adding new infrastructure or worrying about ID collisions. | ||
|
||
## Using ID Obfuscator | ||
|
||
Let's begin with an example, then break it down: | ||
|
||
```java | ||
final BitRotationIntegerObfuscator rotate = new BitRotationIntegerObfuscator(17); | ||
final OffsetIntegerObfuscator offset = new OffsetIntegerObfuscator(785374208); | ||
final XorIntegerObfuscator xor = new XorIntegerObfuscator(4444266); | ||
final MultiplicativeInverseIntegerObfuscator inverse = | ||
new MultiplicativeInverseIntegerObfuscator(5237459); | ||
|
||
final AlphabetCodec codec = new AlphabetCodec(new AlphabetBuilder() | ||
.includeUppercaseLatinLetters() | ||
.excludeVowels() | ||
.excludeVisuallySimilarCharacters() | ||
.shuffleWithRandomSeed(95839275) | ||
.build()); | ||
|
||
final IntegerObfuscationPipeline pipeline = new IntegerObfuscationPipeline(codec, | ||
rotate, offset, xor, inverse); | ||
|
||
System.out.println("| id | obfuscated id |"); | ||
System.out.println("|----|---------------|"); | ||
|
||
for (int id = 0; id < 10; id++) { | ||
System.out.format("| %d | %s |\n", id, pipeline.obuscate(id)); | ||
assert id == pipeline.deobfuscate(pipeline.obuscate(id)); | ||
} | ||
``` | ||
|
||
The example produces the following output: | ||
|
||
| id | obfuscated id | | ||
|----|---------------| | ||
| 0 | RYRJYLCR | | ||
| 1 | QJRDDJPV | | ||
| 2 | TTQRHYDR | | ||
| 3 | RKZQBNBP | | ||
| 4 | DPXDZVDC | | ||
| 5 | TBLQTFL | | ||
| 6 | RYCTHKQJ | | ||
| 7 | QJFVWXRL | | ||
| 8 | TTKZCWMJ | | ||
| 9 | RZMYYZRX | | ||
|
||
In the above example, there are three major pieces of the puzzle: obfuscators, codecs, and a pipeline. We'll discuss each in turn. | ||
|
||
### The pipeline | ||
|
||
The main point of interaction with ID Obfuscator is the `ObfuscationPipeline`. The pipeline combines a number of obfuscators and exactly one codec into a coherent tool for obfuscating and deobfuscating numbers. The type, number, and configuration of the obfuscators and the type and configuration codec all control the behavior of the pipeline. As an example, let's change the order of the obfuscators in the demo above to `offset, rotate, xor, inverse` (i.e. we swap the positions of `offset` and `rotate`). Now the output looks like this: | ||
|
||
| id | obfuscated id | | ||
|----|---------------| | ||
| 0 | DRTPBTCH | | ||
| 1 | DZNMZKTM | | ||
| 2 | WYFQDKR | | ||
| 3 | RLDBHFLT | | ||
| 4 | QFDNWZQR | | ||
| 5 | TWTLCHBT | | ||
| 6 | DRVJYCQC | | ||
| 7 | DVYHLLBV | | ||
| 8 | WFWMBTJ | | ||
| 9 | RLKJKQFP | | ||
|
||
This is, obviously, very different from the original output. We could achieve similar output changes by changing the value of the offset passed to the `OffsetIntegerObfuscator`, for example, or changing the random seed passed to the codec. This has two very important consequences: | ||
|
||
1. A malicious user needs to know the exact type, order, and configuration of the obfuscators and codec in your pipeline in order to turn obfuscated IDs into their original numerical representations. | ||
2. *You* need to know the exact type, order, and configuration of the obfuscators and codec in your pipeline in order to turn obfuscated IDs into their original numerical representations. | ||
|
||
It's extremely important that you hold on to the "shape" and configuration of your pipeline once you start obfuscating IDs; if you lose it, you won't be able to deobfuscate your own IDs. Similarly, you *absolutely should not* randomly-generate pipeline parameters at runtime, because there's no guarantee they'll be the same from one run to the next. In other words, this is fine: | ||
|
||
```java | ||
offset = new OffsetIntegerObfuscator(785374208); | ||
``` | ||
|
||
…but this is an extremely bad idea: | ||
|
||
```java | ||
offset = new OffsetIntegerObfuscator(new SecureRandom().nextInt()); | ||
``` | ||
|
||
### Obfuscators | ||
|
||
Obfuscators reversibly transform one number into another number. As a trivial example, an obfuscator might transform an a number by adding 27 to it, and then later reverse the transformation by subtracting 27. As shown in the example above, ID Obfuscator provides a number of obfuscators out of the box, and each is configurable (so your `OffsetIntegerObfuscator` may be very different from somebody else's). | ||
|
||
Some of the obfuscators available out of the box include: | ||
|
||
- `BitRotationIntegerObfuscator` performs a [circular shift](https://en.wikipedia.org/wiki/Circular_shift) of configurable distance on the bits in a number | ||
- `MultiplicativeInverseIntegerObfuscator` obfuscates numbers by multiplying them by a "secret" you provide, then deobfuscates them by multiplying by the [multiplicative inverse](https://ericlippert.com/2013/11/12/math-from-scratch-part-thirteen-multiplicative-inverses/) of the secret | ||
- `OffsetIntegerObfuscator` obfuscates a number by adding a "secret" you provide, then deobfuscates by subtracting the secret | ||
- `XorIntegerObfuscator` obfuscates and deobfuscates numbers by applying a [bitwise XOR](https://en.wikipedia.org/wiki/Bitwise_operations_in_C#Bitwise_XOR_.22.5E.22) operation with a "secret" you provide | ||
|
||
You're free to add your own obfuscators, too! | ||
|
||
### Codecs | ||
|
||
A codec takes a (possibly obfuscated) number and represents it as a string. Later, it can transform that string back into the original number. Codecs may provide a measure of obfuscation in their own right; for example, it might just represent the number as a decimal string, but shuffle the digits. You could, in principle, have a pipeline that has no obfuscators and only has a codec. | ||
|
||
ID Obfuscator comes with `AlphabetCodec`, which uses an alphabet you provide to represent numbers as strings, but you can certainly provide your own codec, too. | ||
|
||
## The details | ||
|
||
ID Obfuscator is just that: an obfuscator. It makes it difficult for malicious users to figure out how to turn an obfuscated ID into a "real" ID, but not impossible. Caveat emptor. | ||
|
||
Currently, ID Obfuscator works with 32-bit integer IDs, but is likely to be extended to include larger data types in the future. |