Binary Data in JSON String. Something better than Base64
The JSON format does not natively support binary data. Binary data has to be escaped so that it can be placed into a string element (i.e. zero or more Unicode characters in double quotes, using backslash escapes) in JSON.
An obvious method to escape binary data is to use Base64. However, Base64 has a high processing overhead. It also expands 3 bytes into 4 characters, which increases the data size by around 33%.
One use case for this is the v0.8 draft of the CDMI cloud storage API specification. You create data objects via a REST web service using JSON, e.g.
PUT /MyContainer/BinaryObject HTTP/1.1
Host: cloud.example.com
Accept: application/vnd.org.snia.cdmi.dataobject+json
Content-Type: application/vnd.org.snia.cdmi.dataobject+json
X-CDMI-Specification-Version: 1.0

{
  "mimetype" : "application/octet-stream",
  "metadata" : [ ],
  "value" : "TWFuIGlzIGRpc3Rpbmd1aXNoZWQsIG5vdCBvbmx5IGJ5IGhpcyByZWFzb24sIGJ1dCBieSB0aGlzIHNpbmd1bGFyIHBhc3Npb24gZnJvbSBvdGhlciBhbmltYWxzLCB3aGljaCBpcyBhIGx1c3Qgb2YgdGhlIG1pbmQsIHRoYXQgYnkgYSBwZXJzZXZlcmFuY2Ugb2YgZGVsaWdodCBpbiB0aGUgY29udGludWVkIGFuZCBpbmRlZmF0aWdhYmxlIGdlbmVyYXRpb24gb2Yga25vd2xlZGdlLCBleGNlZWRzIHRoZSBzaG9ydCB2ZWhlbWVuY2Ugb2YgYW55IGNhcm5hbCBwbGVhc3VyZS4="
}
Are there better ways and standard methods to encode binary data into JSON strings?
Answer by richardtallent for Binary Data in JSON String. Something better than Base64
yEnc might work for you:
http://en.wikipedia.org/wiki/Yenc
Answer by jsoverson for Binary Data in JSON String. Something better than Base64
Since you're looking for the ability to shoehorn binary data into a strictly text-based and very limited format, I think Base64's overhead is minimal compared to the convenience you're expecting to maintain with JSON. If processing power and throughput are a concern, then you'd probably need to reconsider your file formats.
Answer by a paid nerd for Binary Data in JSON String. Something better than Base64
(Edit 7 years later: Google Gears is gone. Ignore this answer.)
The Google Gears team ran into the lack-of-binary-data-types problem and has attempted to address it:
JavaScript has a built-in data type for text strings, but nothing for binary data. The Blob object attempts to address this limitation.
Maybe you can weave that in somehow.
Answer by hobbs for Binary Data in JSON String. Something better than Base64
There are 94 Unicode characters which can be represented as one byte according to the JSON spec (if your JSON is transmitted as UTF-8). With that in mind, I think the best you can do space-wise is base85 which represents four bytes as five characters. However, this is only a 7% improvement over base64, it's more expensive to compute, and implementations are less common than for base64 so it's probably not a win.
You could also simply map every input byte to the corresponding character in U+0000-U+00FF, then do the minimum encoding required by the JSON standard to pass those characters; the advantage here is that the required decoding is nil beyond builtin functions, but the space efficiency is bad -- a 105% expansion (if all input bytes are equally likely) vs. 25% for base85 or 33% for base64.
Final verdict: base64 wins, in my opinion, on the grounds that it's common, easy, and not bad enough to warrant replacement.
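To make the comparison above concrete, here is a minimal sketch that measures the three options on one sample. It assumes Python's standard base64 and json modules as stand-ins (the answer names no implementation), and uses every byte value exactly once as a proxy for uniformly distributed input. The function name encoded_sizes is made up for illustration.

```python
import base64
import json


def encoded_sizes(data: bytes) -> dict:
    """Return the UTF-8 byte length of each encoded payload (without JSON quotes)."""
    b64 = base64.b64encode(data)
    # b85encode's alphabet contains neither '"' nor '\', so it needs no JSON escaping.
    b85 = base64.b85encode(data)
    # Map each byte to U+0000-U+00FF, then let json.dumps escape only what it must.
    latin1 = json.dumps(data.decode("latin-1"), ensure_ascii=False)
    return {
        "base64": len(b64),
        "base85": len(b85),
        "latin1": len(latin1.encode("utf-8")) - 2,  # drop the two surrounding quotes
    }


# Every byte value exactly once stands in for uniformly distributed input.
sample = bytes(range(256))
sizes = encoded_sizes(sample)
for name, size in sizes.items():
    growth = 100 * (size - len(sample)) / len(sample)
    print(f"{name}: {size} bytes, {growth:.0f}% expansion")

# Decoding the byte-per-character variant needs only built-in functions.
restored = json.loads(json.dumps(sample.decode("latin-1"))).encode("latin-1")
```

On this sample the measured sizes line up with the figures quoted in the answer; other JSON libraries may escape control characters differently and shift the last number slightly.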
Answer by StaxMan for Binary Data in JSON String. Something better than Base64
While it is true that base64 has a ~33% expansion rate, it is not necessarily true that the processing overhead is significantly more than this: it really depends on the JSON library/toolkit you are using. Encoding and decoding are simple, straightforward operations, and they can even be optimized wrt character encoding (as JSON only supports UTF-8/16/32) -- base64 characters are always single-byte for JSON String entries. For example, on the Java platform there are libraries that can do the job rather efficiently, so that the overhead is mostly due to the expanded size.
I agree with two earlier answers:
- base64 is a simple, commonly used standard, so you are unlikely to find something better specifically for use with JSON (base-85 is used by PostScript etc., but the benefits are at best marginal when you think about it)
- compression before encoding (and after decoding) may make a lot of sense, depending on the data you use
Answer by andrej for Binary Data in JSON String. Something better than Base64
If you deal with bandwidth problems, try to compress data at the client side first, then base64-it.
A nice example of such magic is at http://jszip.stuartk.co.uk/ and more discussion of this topic is at "JavaScript implementation of Gzip".
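The compress-then-encode order suggested here can be sketched as follows. This is only an illustration using Python's zlib and base64 modules rather than the JavaScript library linked above; the pack and unpack helpers and the "encoding" field are hypothetical names, not part of any specification.

```python
import base64
import json
import zlib


def pack(data: bytes) -> str:
    """Compress first, then Base64-encode, and wrap the result in a JSON document."""
    compressed = zlib.compress(data, 9)
    return json.dumps({
        "encoding": "zlib+base64",  # hypothetical marker so the receiver knows how to decode
        "value": base64.b64encode(compressed).decode("ascii"),
    })


def unpack(document: str) -> bytes:
    """Reverse of pack(): Base64-decode, then decompress."""
    value = json.loads(document)["value"]
    return zlib.decompress(base64.b64decode(value))


payload = b"Binary data in JSON. " * 200  # repetitive, so it compresses well
document = pack(payload)
plain = base64.b64encode(payload).decode("ascii")
print(len(payload), len(plain), len(document))
```

The gain depends entirely on the data: already-compressed content such as JPEG or MP3 will barely shrink, so the extra step mainly pays off for compressible payloads.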
Answer by DarcyThomas for Binary Data in JSON String. Something better than Base64
BSON (Binary JSON) may work for you. http://en.wikipedia.org/wiki/BSON
Edit: FYI the .NET library json.net supports reading and writing bson if you are looking for some C# server side love.
Answer by Stefano Fratini for Binary Data in JSON String. Something better than Base64
The Smile format is very fast to encode and decode, and compact
Speed comparison (java based but meaningful nevertheless): https://github.com/eishay/jvm-serializers/wiki/
Also it's an extension to JSON that allows you to skip base64 encoding for byte arrays
Smile encoded strings can be gzipped when space is critical
Answer by Koushik for Binary Data in JSON String. Something better than Base64
The data type really matters. I have tested different scenarios for sending the payload from a RESTful resource. For encoding I used Base64 (Apache) and for compression GZIP (java.util.zip.*). The payload contains information about a film, an image and an audio file. Compressing and encoding the image and audio files drastically degraded the performance. Encoding before compression turned out well. Image and audio content were sent as encoded and compressed byte[].
Answer by Dheeraj Sangamkar for Binary Data in JSON String. Something better than Base64
Refer: http://snia.org/sites/default/files/Multi-part%20MIME%20Extension%20v1.0g.pdf
It describes a way to transfer binary data between a CDMI client and server using 'CDMI content type' operations without requiring base64 conversion of the binary data.
If you can use a 'Non-CDMI content type' operation, it is ideal for transferring 'data' to/from an object. Metadata can then be added to or retrieved from the object later, as a subsequent 'CDMI content type' operation.
Answer by chmike for Binary Data in JSON String. Something better than Base64
The problem with UTF-8 is that it is not the most space-efficient encoding. Also, some random binary byte sequences are invalid UTF-8. So you can't just interpret a random binary byte sequence as UTF-8 data, because it may be invalid UTF-8. The benefit of this constraint on the UTF-8 encoding is that it makes it robust: it is possible to locate where multi-byte chars start and end, whatever byte we start looking at.
As a consequence, while encoding a byte value in the range [0..127] needs only one byte in UTF-8, encoding a byte value in the range [128..255] requires 2 bytes! Worse than that, in JSON, control chars, " and \ are not allowed to appear unescaped in a string. So the binary data would require some transformation to be properly encoded.
Let's see. If we assume uniformly distributed random byte values in our binary data then, on average, half of the bytes would be encoded in one byte and the other half in two bytes. The UTF-8 encoded binary data would have 150% of the initial size.
Base64 encoding grows only to 133% of the initial size. So Base64 encoding is more efficient.
What about using another base encoding? In UTF-8, encoding the 128 ASCII values is the most space efficient. In 8 bits you can store 7 bits. So if we cut the binary data into 7-bit chunks and store each chunk in one byte of a UTF-8 encoded string, the encoded data would grow only to 114% of the initial size. Better than Base64. Unfortunately we can't use this easy trick because JSON doesn't allow some ASCII chars. The 33 control characters of ASCII ([0..31] and 127) and the " and \ must be excluded. This leaves us only 128-35 = 93 chars.
So in theory we could define a Base93 encoding which would grow the encoded size to 8/log2(93) = 8*log10(2)/log10(93) = 122%. But a Base93 encoding would not be as convenient as a Base64 encoding. Base64 only requires cutting the input byte sequence into 6-bit chunks, for which simple bitwise operations work well. Besides, 133% is not much more than 122%.
This is why I came independently to the common conclusion that Base64 is indeed the best choice to encode binary data in JSON. My answer presents a justification for it. I agree it isn't very attractive from the performance point of view, but consider also the benefit of using JSON with its human-readable string representation that is easy to manipulate in all programming languages.
If performance is critical, then a pure binary encoding should be considered as a replacement for JSON. But with JSON my conclusion is that Base64 is the best.
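The percentages in this answer all follow from one formula: an alphabet of N one-byte characters carries log2(N) bits per output byte, so the size grows by a factor of 8/log2(N). A small sketch to reproduce the arithmetic (the helper name expansion is just for illustration):

```python
import math


def expansion(alphabet_size: int) -> float:
    """Encoded size / original size when each output byte carries log2(N) bits."""
    return 8 / math.log2(alphabet_size)


# Raw bytes reinterpreted as code points: half need one UTF-8 byte, half need two.
utf8_naive = 0.5 * 1 + 0.5 * 2

print(f"naive UTF-8: {utf8_naive:.0%}")
for name, size in [("Base64", 64), ("Base93", 93), ("7-bit ASCII", 128)]:
    print(f"{name}: {expansion(size):.0%}")
```

Note that these are theoretical limits; a practical Base93 would also need block sizes that divide evenly, which is part of why it is less convenient than the 6-bit chunks of Base64.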
Answer by Rolf Rost for Binary Data in JSON String. Something better than Base64
My current solution is XHR2 with an ArrayBuffer. The ArrayBuffer, as a binary sequence, contains multipart content (video, audio, graphics, text and so on) with multiple content types, all in one response.
Modern browsers provide DataView, StringView and Blob for the different components. See also: http://rolfrost.de/video.html for more details.
Answer by Ælex for Binary Data in JSON String. Something better than Base64
I know this is a nearly 6-year-old question, but I ran into the same problem and thought I'd share a solution: multipart/form-data.
By sending a multipart form, you first send your JSON metadata as a string, and then separately send the raw binary data (images, WAVs, etc.), indexed by the Content-Disposition name.
Here's a nice tutorial on how to do this in obj-c, and here is a blog article that explains how to partition the string data with the form boundary, and separate it from the binary data.
The only change you really need to make is on the server side; you will have to capture your metadata, which should reference the POSTed binary data appropriately (by using a Content-Disposition boundary).
Granted it requires additional work on the server side, but if you are sending many images or large images, this is worth it. Combine this with gzip compression if you want.
IMHO sending base64 encoded data is a hack; the RFC multipart/form-data was created for issues such as this: sending binary data in combination with text or meta-data.
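As a rough sketch of what such a request body looks like, the following builds a multipart/form-data payload by hand with the Python standard library. The function name build_multipart, the part name "metadata", and the sample field names are assumptions made for illustration; a real client would normally let its HTTP library assemble the form.

```python
import json
import uuid


def build_multipart(metadata: dict, blobs: dict) -> tuple:
    """Return (body, content_type) for a multipart/form-data request."""
    # A random boundary; it is assumed not to occur inside the binary parts.
    boundary = uuid.uuid4().hex
    delimiter = b"--" + boundary.encode("ascii") + b"\r\n"
    body = bytearray()

    # First part: the JSON metadata, sent as text.
    body += delimiter
    body += b'Content-Disposition: form-data; name="metadata"\r\n'
    body += b"Content-Type: application/json\r\n\r\n"
    body += json.dumps(metadata).encode("utf-8") + b"\r\n"

    # Remaining parts: raw bytes, indexed by the Content-Disposition name.
    for name, data in blobs.items():
        header = f'Content-Disposition: form-data; name="{name}"; filename="{name}"\r\n'
        body += delimiter
        body += header.encode("ascii")
        body += b"Content-Type: application/octet-stream\r\n\r\n"
        body += data + b"\r\n"

    # Closing delimiter.
    body += b"--" + boundary.encode("ascii") + b"--\r\n"
    return bytes(body), f"multipart/form-data; boundary={boundary}"


raw = bytes(range(256))
body, content_type = build_multipart({"title": "demo", "image": "image1"}, {"image1": raw})
print(content_type, len(body))
```

The binary part travels unmodified, so the only overhead is the part headers and boundaries; the metadata refers to the binary part by its name ("image1" here).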
Answer by Martin Kersten for Binary Data in JSON String. Something better than Base64
Just to add the resource and complexity standpoint to the discussion: since PUT/POST and PATCH are used for storing new resources and altering them, one should remember that the content transferred is an exact representation of the content that is stored and that is received by issuing a GET operation.
A multipart message is often used as a savior, but for reasons of simplicity, and for more complex tasks, I prefer the idea of giving the content as a whole. It is self-explanatory and it is simple.
And yes, JSON is somewhat crippling, but in the end JSON itself is verbose, and the overhead of mapping to Base64 is way too small to matter.
To use multipart messages correctly, one has to either dismantle the object to send, use a property path as the parameter name for automatic recombination, or create another protocol/format just to express the payload.
While I also like the BSON approach, it is not as widely and easily supported as one would like it to be.
Basically, we are just missing something here, but embedding binary data as Base64 is well established and the way to go unless you have really identified the need to do real binary transfer (which is rarely the case).