Is there a way to detect UTF off of a URL?

SkyFrontier
Posts: 2341
Joined: 04 Jan 2010 14:27
Location: Pasárgada (eu vou!)

Is there a way to detect UTF off of a URL?

Post by SkyFrontier »

Title says it all.
I'm wondering whether XY scripting can detect the encoding of a URL's content without having to readurl() it and then write a file first so that filetype() can do the job.
Thanks.
New User's Ref. Guide and Quick Setup Guide can help a bit! Check XYplorer Resources Index for many useful links!
Want a new XYperience? XY MOD - surfYnXoard
-coz' the aim of computing is to free us to LIVE...

Marco
Posts: 2354
Joined: 27 Jun 2011 15:20

Re: Is there a way to detect UTF off of a URL?

Post by Marco »

Hmm, let's see if I understand this correctly.
The HTML code of the www.xyplorer.com page starts as follows:

Code:

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN"  "http://www.w3.org/TR/html4/loose.dtd">
<html><head><title>XYplorer - A Windows File Manager and Explorer Replacement</title>
  <meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
  <meta http-equiv="Content-Language" content="en">
Are you interested in getting the info on the third line? If so, then no: the only way is to read the HTML code, since the HTTP response headers (which you can inspect at http://web-sniffer.net/ ) don't contain that information for this page.
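
To make the two possible locations concrete, here is a minimal sketch in Python (not XY scripting) that pulls a charset out of a Content-Type value and out of the meta tags in an HTML head. The function names are hypothetical, and the regex is a deliberate simplification rather than a real HTML parser.

```python
import re
from typing import Optional

# Matches charset=VALUE, with or without quotes around VALUE.
_CHARSET_RE = re.compile(r"charset\s*=\s*[\"']?([A-Za-z0-9._:-]+)", re.IGNORECASE)


def charset_from_content_type(value: Optional[str]) -> Optional[str]:
    """Return the charset named in a Content-Type value, or None if absent."""
    if not value:
        return None
    match = _CHARSET_RE.search(value)
    return match.group(1).lower() if match else None


def charset_from_html_head(html: str) -> Optional[str]:
    """Return the charset from the first <meta> tag that declares one."""
    for tag in re.findall(r"<meta\b[^>]*>", html, re.IGNORECASE):
        found = charset_from_content_type(tag)
        if found:
            return found
    return None
```

Fed the head shown above, charset_from_html_head() would report iso-8859-1, while a bare "text/html" Content-Type value yields nothing, which is the situation described for this page.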
Tag Backup - SimpleUpdater - XYplorer Messenger - The Unofficial XYplorer Archive - Everything in XYplorer
Don sees all [cit. from viewtopic.php?p=124094#p124094]

SkyFrontier

Re: Is there a way to detect UTF off of a URL?

Post by SkyFrontier »

Thanks, Marco.
Since XY's writefile() offers this mode:
t: [default] text; auto-detects whether text can be written as ASCII or needs to be written as UNICODE
I thought there should be a way to produce samples of (online or offline) documents written in the original source's encoding, since the clipped section may not contain a special character and thus would not trigger writefile()'s auto-detection.

For online sources, writing a local file, then using readfile([numbytes]), then writefile(), then deleting the source document is so CPU/HDD-intensive that I scrapped the whole method.

Detecting ~charset=iso-8859-1~ can be done, but that isn't accurate in the scenario I describe, see?
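
The sampling problem can be shown with a small Python sketch (again, not XY scripting). The guess_encoding() helper and its labels are hypothetical; it only checks for a byte order mark, then pure ASCII, then valid UTF-8, so it is a rough heuristic and not a full detector.

```python
import codecs

# UTF-32 BOMs are listed before UTF-16 because the UTF-32-LE BOM
# begins with the same two bytes as the UTF-16-LE BOM.
_BOMS = (
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
)


def guess_encoding(data: bytes) -> str:
    """Guess an encoding label for raw bytes held in memory."""
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name
    try:
        data.decode("ascii")
        return "ascii"
    except UnicodeDecodeError:
        pass
    try:
        data.decode("utf-8")
        return "utf-8"
    except UnicodeDecodeError:
        # Some 8-bit codepage; the bytes alone cannot say which one.
        return "unknown-8bit"
```

A clipped sample with no special characters comes back as "ascii" even when the full document is UTF-8, so a guess made from a short sample can differ from the source's real encoding.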

Marco

Re: Is there a way to detect UTF off of a URL?

Post by Marco »

Maybe isunicode() can help you? [p. 348 of the help]

SkyFrontier

Re: Is there a way to detect UTF off of a URL?

Post by SkyFrontier »

Marco wrote: Maybe isunicode() can help you? [p. 348 of the help]
It could, but it doesn't support URL parsing... :roll:

Marco

Re: Is there a way to detect UTF off of a URL?

Post by Marco »

You could do something like

Code:

$code=readurl("desiredurl.com");
isunicode($code);
and check what isunicode() returns. Keep in mind that the whole HTML code must flow somewhere in the script. Even if isunicode() could handle URLs, it would still have to make HTTP requests and then check the whole code. As it stands, that step is simply visible to the user.
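
The same fetch-then-check flow, with nothing written to disk, might look like this in Python (not XY scripting). Both function names are hypothetical, and the non-ASCII test is only a stand-in for illustration; it is not a claim about what isunicode() checks internally.

```python
from typing import Callable
from urllib.request import urlopen


def fetch_bytes(url: str, timeout: float = 10.0) -> bytes:
    """Download the raw response body into memory (no temporary file)."""
    with urlopen(url, timeout=timeout) as response:
        return response.read()


def url_has_non_ascii(url: str, fetch: Callable[[str], bytes] = fetch_bytes) -> bool:
    """Fetch the whole body, then report whether any byte is outside ASCII."""
    body = fetch(url)
    return any(byte > 0x7F for byte in body)
```

The fetch parameter is there so the check can be tried against canned bytes without a network connection. Either way, the whole body still has to be downloaded before anything can be checked.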

SkyFrontier

Re: Is there a way to detect UTF off of a URL?

Post by SkyFrontier »

:!: A combination of readurl() + isunicode() may solve the problem, Marco. I'll give this one another try. Thanks!
(dumb me...)

SkyFrontier

Re: Is there a way to detect UTF off of a URL?

Post by SkyFrontier »

Marco wrote: You could do something like

Code:

$code=readurl("desiredurl.com");
isunicode($code);
and check what isunicode() returns. Keep in mind that the whole HTML code must flow somewhere in the script. Even if isunicode() could handle URLs, it would still have to make HTTP requests and then check the whole code. As it stands, that step is simply visible to the user.
Yes, just had the same idea... Will give it a go.
