Draft: Add more encodings #45

Draft: wants to merge 1 commit into main

BlackthornYugen commented Jun 16, 2024

New endpoints:
/encoding/<charset>
/encoding/<charset>/<base64body>

You can generate some URL-safe base64 bodies for these endpoints like this:

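# Build one URL per charset: re-encode the sample text, then base64 it with the URL-safe alphabet.
for charset in utf-8 utf-16 utf-32 Shift_JIS EUC-JP; do
    # tr -d '\n' removes the line wrapping that GNU base64 adds to long output
    echo "https://httpbin.jskw.dev/encoding/${charset}/$(echo 'HTTPBIN は最高です🎉' | iconv -s -f 'utf-8' -t "${charset}" 2> /dev/null | base64 | tr -d '\n' | tr '/+' '_-')"
done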

That script generates these URLs:

The browser (or any client) should be able to render the same characters from each URL, with the exception of the Unicode-only '🎉', which Shift_JIS and EUC-JP cannot represent.
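For anyone who wants to inspect one of these bodies by hand, here is a minimal sketch of the reverse step: map the URL-safe alphabet back to standard base64 and decode. The function name `decode_body` is made up for illustration; this is not necessarily how the server in this PR does it.

```shell
# Hypothetical helper: turn a URL-safe base64 body back into raw bytes.
# Usage: decode_body BASE64BODY
decode_body() {
    # Undo the tr '/+' '_-' substitution, then decode standard base64.
    printf '%s' "$1" | tr '_-' '/+' | base64 --decode
}

decode_body 'SGVsbG8gV29ybGQK'   # prints "Hello World"
```

The raw bytes are still in whatever charset was named in the URL, so for anything other than ASCII-compatible encodings you would pipe the result through `iconv` or `xxd` to read it.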

The existing endpoint https://httpbin.dev/encoding/utf8 continues to work the same with this change at https://httpbin.jskw.dev/encoding/utf8, and I also have it re-encode the demo into utf16 and utf32. Interestingly, Firefox does a better job at rendering utf16, but neither Chrome nor Firefox handles utf32 at all.

To be fair, utf16 and utf32 aren't very useful in the browser. They only really make sense for specialized clients that need to seek through Unicode data without the variable-length encoding of the larger code points.
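To make the seeking argument concrete, here is a rough sketch that counts how many bytes a single character takes in each encoding. The helper name `byte_len` is made up for illustration, and it assumes an `iconv` that knows the `UTF-16BE` and `UTF-32BE` names (used here so no BOM is counted).

```shell
# Hypothetical helper: print how many bytes TEXT occupies in CHARSET.
# Usage: byte_len CHARSET TEXT
byte_len() {
    # wc -c counts bytes; tr strips the padding some wc builds emit.
    printf '%s' "$2" | iconv -f UTF-8 -t "$1" | wc -c | tr -d ' '
}

for ch in a は 🎉; do
    for charset in UTF-8 UTF-16BE UTF-32BE; do
        printf '%s in %s: %s bytes\n' "$ch" "$charset" "$(byte_len "$charset" "$ch")"
    done
done
```

Only UTF-32 comes out as 4 bytes for every character. UTF-16 uses 2 bytes for 'a' and 'は' but 4 for '🎉' (a surrogate pair), so it is fixed-width only for text that stays inside the Basic Multilingual Plane.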

I've found that Firefox does pretty well with both utf-8 and utf-16 but not utf-32. Chrome makes a bit of a mess of the utf-8 demo re-encoded to utf-16.

BlackthornYugen (Author) commented Jun 17, 2024

# Fetch each URL and hex-dump the body; curl writes the effective URL to stderr, so xxd only sees the response.
for charset in US-ASCII ISO-8859-1 Windows-1252 UTF-8 UTF-16 UTF-32; do
    curl --write-out '%{stderr}\n%{url_effective}\n' --silent "https://httpbin.jskw.dev/encoding/${charset}/$(echo 'Hello World' | iconv -s -f 'utf-8' -t "${charset}" 2> /dev/null | base64 | tr '/+' '_-')" | xxd
done
https://httpbin.jskw.dev/encoding/US-ASCII/SGVsbG8gV29ybGQK
00000000: 4865 6c6c 6f20 576f 726c 640a            Hello World.

https://httpbin.jskw.dev/encoding/ISO-8859-1/SGVsbG8gV29ybGQK
00000000: 4865 6c6c 6f20 576f 726c 640a            Hello World.

https://httpbin.jskw.dev/encoding/Windows-1252/SGVsbG8gV29ybGQK
00000000: 4865 6c6c 6f20 576f 726c 640a            Hello World.

https://httpbin.jskw.dev/encoding/UTF-8/SGVsbG8gV29ybGQK
00000000: 4865 6c6c 6f20 576f 726c 640a            Hello World.

https://httpbin.jskw.dev/encoding/UTF-16/_v8ASABlAGwAbABvACAAVwBvAHIAbABkAAo=
00000000: feff 0048 0065 006c 006c 006f 0020 0057  ...H.e.l.l.o. .W
00000010: 006f 0072 006c 0064 000a                 .o.r.l.d..

https://httpbin.jskw.dev/encoding/UTF-32/AAD-_wAAAEgAAABlAAAAbAAAAGwAAABvAAAAIAAAAFcAAABvAAAAcgAAAGwAAABkAAAACg==
00000000: 0000 feff 0000 0048 0000 0065 0000 006c  .......H...e...l
00000010: 0000 006c 0000 006f 0000 0020 0000 0057  ...l...o... ...W
00000020: 0000 006f 0000 0072 0000 006c 0000 0064  ...o...r...l...d
00000030: 0000 000a                                ....
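
The UTF-16 and UTF-32 dumps above start with a byte order mark (`feff` and `0000 feff`), which is how a client can tell the byte order. As a rough sketch, a check for that could look like the following; `detect_bom` is a made-up helper name, not part of this PR.

```shell
# Hypothetical helper: report which byte order mark (if any) starts stdin.
detect_bom() {
    # Hex of the first four bytes with separators removed, e.g. "0000feff".
    head_hex=$(od -An -tx1 -N4 | tr -d ' \n')
    case "$head_hex" in
        0000feff) echo "UTF-32BE" ;;
        fffe0000) echo "UTF-32LE" ;;   # must be tested before the UTF-16LE pattern
        efbbbf*)  echo "UTF-8" ;;
        feff*)    echo "UTF-16BE" ;;
        fffe*)    echo "UTF-16LE" ;;
        *)        echo "none" ;;
    esac
}

# The UTF-16 body from the output above, decoded from URL-safe base64.
printf '%s' '_v8ASABlAGwAbABvACAAVwBvAHIAbABkAAo=' | tr '_-' '/+' | base64 --decode | detect_bom   # prints "UTF-16BE"
```

The single-byte and UTF-8 responses above carry no BOM, so the same check reports `none` for them.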
