Blog coding and discussion of coding about JavaScript, PHP, CGI, general web building etc.

Monday, January 25, 2016

How do I determine file encoding in OSX?


I'm trying to enter some UTF-8 characters into a LaTeX file in TextMate (which says its default encoding is UTF-8), but LaTeX doesn't seem to understand them. Running cat my_file.tex shows the characters properly in Terminal. Running ls -al shows something I've never seen before: an "@" by the file listing:

-rw-r--r--@  1 me      users      2021 Feb 11 18:05 my_file.tex  

(And, yes, I'm using \usepackage[utf8]{inputenc} in the LaTeX.)

I've found iconv, but that doesn't seem to be able to tell me what the encoding is -- it'll only convert once I figure it out.

Answer by codelogic for How do I determine file encoding in OSX?


The @ means that the file has extended file attributes associated with it. You can query them using the getxattr() function.

There's no definitive way to detect the encoding of a file. Read this answer; it explains why.

There's a command line tool, enca, that attempts to guess the encoding. You might want to check it out.
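
To illustrate why detection is only ever a guess, here is a minimal Python sketch (not part of the original answer). The helper name candidate_encodings and the list of candidates are assumptions made for the example. It tries to decode the same bytes under several encodings and reports every one that succeeds; more than one usually does.

```python
# Hypothetical helper: the name and the default candidate list are
# illustrative assumptions, not taken from enca or any other tool.
def candidate_encodings(data, candidates=("ascii", "utf-8", "utf-16", "latin-1")):
    """Return the candidate encodings that decode `data` without an error."""
    matches = []
    for name in candidates:
        try:
            data.decode(name)  # strict decoding raises on invalid input
        except UnicodeDecodeError:
            continue
        matches.append(name)
    return matches


# The UTF-8 encoding of U+00E9 (e with acute accent) is the bytes 0xC3 0xA9.
sample = "\u00e9".encode("utf-8")
matches = candidate_encodings(sample)
print(matches)
```

Latin-1 assigns a character to every byte value, so it never fails to decode. A successful decode therefore only shows that an encoding is possible, not that it is the one the author used.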

Answer by Keltia for How do I determine file encoding in OSX?


Which LaTeX are you using? When I was using teTeX, I had to manually download the unicode package and add this to my .tex files:

% UTF-8 stuff
\usepackage[notipa]{ucs}
\usepackage[utf8x]{inputenc}
\usepackage[T1]{fontenc}

Now that I've switched over to XeTeX from the TeX Live 2008 package (here), it is even simpler:

% UTF-8 stuff
\usepackage{fontspec}
\usepackage{xunicode}

As for detecting a file's encoding, you could play with file(1), although it is rather limited. Like someone else said, it is difficult.

Answer by jalf for How do I determine file encoding in OSX?


A brute-force way to check the encoding is to open the file in a hex editor or similar (or write a program to check) and look at the raw bytes. The UTF-8 format is fairly easy to recognize: all ASCII characters are single bytes with values below 128 (0x80), and multibyte sequences follow the pattern shown in the wiki article (a lead byte of the form 110xxxxx, 1110xxxx or 11110xxx, followed by continuation bytes of the form 10xxxxxx).

If you can find a simpler way to get a program to verify the encoding for you, that's obviously a shortcut, but if all else fails, this would do the trick.
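
The "write a program to check" route could look something like the following Python sketch. It is an illustration under assumptions, not the answerer's own code: the function name check_utf8 is made up, and instead of inspecting bit patterns by hand it relies on Python's strict UTF-8 decoder, which rejects malformed multibyte sequences.

```python
# Hypothetical helper; the name and return shape are assumptions for this sketch.
def check_utf8(data):
    """Return (is_valid_utf8, has_non_ascii, offset_of_first_bad_byte_or_None)."""
    # Bytes of 0x80 and above cannot be plain ASCII.
    has_non_ascii = any(byte >= 0x80 for byte in data)
    try:
        data.decode("utf-8")  # strict mode: raises on any malformed sequence
    except UnicodeDecodeError as err:
        return False, has_non_ascii, err.start
    return True, has_non_ascii, None


# The same word encoded two ways: only the first is valid UTF-8.
as_utf8 = check_utf8("na\u00efve".encode("utf-8"))
as_latin1 = check_utf8("na\u00efve".encode("latin-1"))
print(as_utf8)
print(as_latin1)
```

To test a real file, read it in binary mode first, for example check_utf8(open("my_file.tex", "rb").read()). A result that is valid but has no non-ASCII bytes means the file is plain ASCII, which is also valid UTF-8.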

Answer by Will Robertson for How do I determine file encoding in OSX?


Classic 8-bit LaTeX is very restricted in which UTF8 characters it can use; it's highly dependent on the encoding of the font you're using and which glyphs that font has available.

Since you don't give a specific example, it's hard to know exactly where the problem is: whether you're attempting to use a glyph that your font doesn't have or whether you're not using the correct font encoding in the first place.

Here's a minimal example showing how a few UTF8 characters can be used in a LaTeX document:

\documentclass{article}
\usepackage[T1]{fontenc}
\usepackage{lmodern}
\usepackage[utf8]{inputenc}
\begin{document}
?Hll??th?r?.?
\end{document}

You may have more luck with the [utf8x] encoding, but be slightly warned that it's no longer supported and has some idiosyncrasies compared with [utf8] (as far as I recall; it's been a while since I've looked at it). But if it does the trick, that's all that matters for you.

Answer by Jouni K. Seppänen for How do I determine file encoding in OSX?


The @ sign means the file has extended attributes. xattr file shows what attributes it has, and xattr -l file shows the attribute values too (which can be large sometimes; try e.g. xattr /System/Library/Fonts/HelveLTMM to see an old-style font that exists in the resource fork).

Answer by dreamlax for How do I determine file encoding in OSX?


Typing file myfile.tex in a terminal can sometimes tell you the encoding and type of file using a series of algorithms and magic numbers. It's fairly useful but don't rely on it providing concrete or reliable information.

A Localizable.strings file (found in localised Mac OS X applications) is typically reported to be a UTF-16 C source file.

Answer by Tim for How do I determine file encoding in OSX?


Using the -I (that's a capital i) option on the file command seems to show the file encoding.

file -I {filename}  

Answer by bx2 for How do I determine file encoding in OSX?


Just use:

file -I {filename}

That's it.

Answer by Cloudranger for How do I determine file encoding in OSX?


In Mac OS X the command file -I (capital i) will give you the proper character set so long as the file you are testing contains characters outside of the basic ASCII range.

For instance, go into Terminal and use vi to create a file (e.g. vi test.txt), insert some characters including an accented one (try ALT-e followed by e), then save the file.

Then type file -I test.txt and you should get a result like this:

test.txt: text/plain; charset=utf-8
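
The caveat about needing characters outside the ASCII range can be shown with a short Python sketch (an illustration added here, not part of the original answer; the variable names are arbitrary). Pure ASCII text produces identical bytes under ASCII, UTF-8 and Latin-1, so no tool can tell those encodings apart until a non-ASCII character appears.

```python
# ASCII-only text: every listed encoding yields exactly the same bytes.
text = "plain ASCII text"
ascii_variants = {text.encode(name) for name in ("ascii", "utf-8", "latin-1")}
print(len(ascii_variants))

# One accented character (U+00E9) is enough to make the byte streams differ.
accented = "\u00e9"
accented_utf8 = accented.encode("utf-8")
accented_latin1 = accented.encode("latin-1")
print(accented_utf8, accented_latin1)
```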

Answer by RPM for How do I determine file encoding in OSX?


You can also convert a file from one character encoding to another using the following command:

iconv -f original_charset -t new_charset originalfile > newfile  

e.g.

iconv -f utf-16le -t utf-8 file1.txt > file2.txt  
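
For comparison, the same decode-then-encode step can be sketched in Python. This is an illustration only, not a replacement for iconv; the function name reencode is made up, and it works on bytes already in memory rather than on files.

```python
# Hypothetical helper mirroring "iconv -f SOURCE -t TARGET" for in-memory bytes.
def reencode(data, source_charset, target_charset):
    """Decode `data` from source_charset and encode it again as target_charset."""
    return data.decode(source_charset).encode(target_charset)


# "caf" plus U+00E9, stored as UTF-16LE, converted to UTF-8.
utf16_bytes = "caf\u00e9".encode("utf-16le")
utf8_bytes = reencode(utf16_bytes, "utf-16le", "utf-8")
print(utf8_bytes)
```

As with iconv, the source encoding has to be known in advance. If the wrong one is given, the result is either an error or silently wrong text.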

Answer by pi3 for How do I determine file encoding in OSX?


Synalyze It! lets you compare text or bytes in all the encodings that the ICU library offers. Using that feature, you can usually see immediately which code page makes sense for your data.

Answer by grokworks for How do I determine file encoding in OSX?


You can try loading the file into a Firefox window and then going to View - Character Encoding. There should be a check mark next to the file's encoding type.

Answer by Adam for How do I determine file encoding in OSX?


Using file command with the --mime-encoding option (e.g. file --mime-encoding some_file.txt) instead of the -I option works on OS X and has the added benefit of omitting the mime type, "text/plain", which you probably don't care about.

Answer by gnuchu for How do I determine file encoding in OSX?


I'm lazy. I just use Sublime Text to switch encodings.

Answer by jmettraux for How do I determine file encoding in OSX?


vim -c 'execute "silent !echo " . &fileencoding | q' {filename}  

aliased somewhere in my bash configuration as

alias vic="vim -c 'execute \"silent !echo \" . &fileencoding | q'"  

so I just type

vic {filename}  

On my vanilla OSX Yosemite, it yields more precise results than "file -I":

$ file -I pdfs/udocument0.pdf
pdfs/udocument0.pdf: application/pdf; charset=binary
$ vic pdfs/udocument0.pdf
latin1
$
$ file -I pdfs/t0.pdf
pdfs/t0.pdf: application/pdf; charset=us-ascii
$ vic pdfs/t0.pdf
utf-8

