searching for text within PDFs /PPTs

vmaniku
Posts: 2
Joined: 04 Mar 2008 15:12

searching for text within PDFs /PPTs

Post by vmaniku »

I'm a new XYplorer user. So far, I've been very impressed with it, but I'm beginning to be disappointed with the search. Hopefully, I'm just not using the features correctly.

Is there a way to search for text within a PDF or PPT document?

I was using the info panel, and going to the "Containing Text" subtab in Find Files. I went to a directory where I know there is a PDF file with a certain matching text. Search doesn't return any results; I even tried the "Match Unicode" option. Is there some other way?

cheers,
V

j_c_hallgren
XY Blog Master
Posts: 5826
Joined: 02 Jan 2006 19:34
Location: So. Chatham MA/Clearwater FL

Post by j_c_hallgren »

Hi and welcome to the XY world!

You are correct that XY is not presently able to locate text within a PDF, mainly because it appears as just a bunch of graphics and not "real" text, which you can see via the Raw View on Info Panel...so any search would have to decrypt/decipher the internal format which a PDF viewer does.
Still spending WAY TOO much time here! But it's such a pleasure helping XY be a treasure!
(XP on laptop with touchpad and thus NO mouse!) Using latest beta vers when possible.

vmaniku
Posts: 2
Joined: 04 Mar 2008 15:12

new feature?

Post by vmaniku »

Thanks for the quick reply!

It should be possible b/c one of the other explorer replacements I looked at before XY was able to search within PDFs and many other formats. I can post this to the requested features forum and maybe it could be added in later.

thanks.

j_c_hallgren
XY Blog Master
Posts: 5826
Joined: 02 Jan 2006 19:34
Location: So. Chatham MA/Clearwater FL

Re: new feature?

Post by j_c_hallgren »

vmaniku wrote:Thanks for the quick reply!

It should be possible b/c one of the other explorer replacements I looked at before XY was able to search within PDFs and many other formats. I can post this to the requested features forum and maybe it could be added in later.

thanks.
Yes, one of the things that you'll find here is quick replies (in most cases)...we try to respond to new users promptly because it shows that, while XY may not do everything the competition does, we have great support here! Which can help offset those missing features...

I noticed that xplorer2 has that search feature in the paid/pro vers...so it's possible...but as Don wrote, scripting is the current feature that is being concentrated on, as it will help make XY more unique..he's also generally quite receptive to user requests, and it doesn't have to be in the "Wishes" forum to be acted upon...

lukescammell
Posts: 744
Joined: 28 Jul 2006 13:15
Location: Kent, UK

Post by lukescammell »

PDF contents searching would actually be damn useful, I can't think why I didn't ask for it before...
Used to update to the latest beta every day. Now I have children instead…
Windows 10 Pro x64 (everywhere except phone…)

fishgod
Posts: 231
Joined: 03 Feb 2008 00:40
Location: Sankt Augustin (near Bonn), Germany

Post by fishgod »

j_c_hallgren wrote:mainly because it appears as just a bunch of graphics and not "real" text, which you can see via the Raw View on Info Panel...so any search would have to decrypt/decipher the internal format which a PDF viewer does.
But this doesn't apply to all PDF documents. As far as I know, text is also stored in a text/Unicode form; why else would it be possible to embed fonts in a PDF document?
Maybe Don can take a look at it and make it work for PDFs with "real" text in them.
Operating System: Win10 x64 / Win11 x64 / almost always newest XY-beta
totally XYscripting-addicted

admin
Site Admin
Posts: 64854
Joined: 22 May 2004 16:48
Location: Win8.1, Win10, Win11, all @100%

Post by admin »

fishgod wrote:
j_c_hallgren wrote:mainly because it appears as just a bunch of graphics and not "real" text, which you can see via the Raw View on Info Panel...so any search would have to decrypt/decipher the internal format which a PDF viewer does.
But this doesn't apply to all PDF documents. As far as I know, text is also stored in a text/Unicode form; why else would it be possible to embed fonts in a PDF document?
Maybe Don can take a look at it and make it work for PDFs with "real" text in them.
XY can search the contents of any file, so this includes PDFs. However, it searches the raw, uninterpreted contents. In the case of complex file formats like PDF or Office files, XY is likely to find either too much or too little, since the raw data and the interpreted data differ to a large degree.
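
To illustrate the raw-versus-interpreted gap, here is a minimal sketch in Python. It assumes the PDF stores its page content in a deflate-compressed stream, which is common but not universal. The sample bytes are made up for illustration, and this is not how XY's search is implemented.

```python
import zlib

# Hypothetical stand-in for the text-drawing operators of one PDF page.
visible_text = b"BT /F1 12 Tf (quarterly report for the quarterly review) Tj ET"

# Assumption: the stream is stored deflate-compressed, as many PDFs do.
raw_stream = zlib.compress(visible_text)


def raw_contains(data: bytes, needle: bytes) -> bool:
    """Plain byte search, which is what a raw content search amounts to."""
    return needle in data


needle = b"quarterly"

# Searching the bytes as they sit on disk.
found_raw = raw_contains(raw_stream, needle)

# Searching after decoding the stream, roughly what a filter would do.
found_decoded = raw_contains(zlib.decompress(raw_stream), needle)

print("raw search hit:", found_raw)
print("decoded search hit:", found_decoded)
```

A PDF whose streams are stored uncompressed would match in both cases, which fits the observation that some terms are found and others are not.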

To search the interpreted text of such files, one has to use filters (interpreters), that are usually supplied to the OS with the installation of such software. (You guessed it: I won't write my own PDF filter... :wink: )

It's on my list but other things will come first.

serendipity
Posts: 3360
Joined: 07 May 2007 18:14
Location: NJ/NY

Post by serendipity »

That explains why I find some terms inside PDFs but not others. It isn't a high priority for me right now, but +1 for this feature.

lem
Posts: 10
Joined: 27 Mar 2006 05:24

Re: searching for text within PDFs /PPTs

Post by lem »

I'd like to add another 'pretty please' for enhancing XYplorer to be able to search for text within PDFs :)

graham
Posts: 457
Joined: 24 Aug 2007 22:08
Location: Isle of Man

Re: searching for text within PDFs /PPTs

Post by graham »

Try this script to do the search.

If necessary, change the path in the script to match your Adobe Reader installation.
I set the script up as a Catalog search item.
To use it, select the PDF file you want to search and click the search script in the Catalog; the script will then ask for the search words.

Code: Select all

// XYplorer PDF Search

// Change the next line to match your Adobe Reader installation
   $adobe = "c:\program files\adobe\reader 9.0\reader\acrord32.exe";

   // The currently selected item is the PDF to search
   $pdf  = "<curitem>";
   input $words, "Enter search word(s) list to search (must have space between)", "word1 word2";
   // /A passes open parameters to Reader; the spaces keep the arguments separated
   $param = " /A ""search=$words"" ";
   $command = """$adobe""".$param."""$pdf""";
   open $command;
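
For anyone who wants to check the quoting outside XYplorer, the same command line can be sketched in Python. This mirrors the script above and has not been verified against every Reader version. The Reader path is the one from the script, while `build_reader_command` and the sample PDF path are hypothetical names used only for illustration.

```python
import subprocess
from typing import List

# Assumed install path, copied from the script; adjust for your system.
ADOBE_READER = r"c:\program files\adobe\reader 9.0\reader\acrord32.exe"


def build_reader_command(pdf_path: str, words: str,
                         reader: str = ADOBE_READER) -> List[str]:
    """Return the argument list: reader, /A, search=<words>, PDF path."""
    return [reader, "/A", f"search={words}", pdf_path]


def open_with_search(pdf_path: str, words: str) -> None:
    """Launch Reader with the search parameter (Windows only; not called here)."""
    subprocess.Popen(build_reader_command(pdf_path, words))


# Hypothetical PDF path, used only to show the resulting command line.
cmd = build_reader_command(r"C:\docs\manual.pdf", "word1 word2")
print(subprocess.list2cmdline(cmd))
```

Passing the arguments as a list leaves the quoting to `subprocess`, so the search words and the PDF path stay separate arguments even when they contain spaces.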

kartal
Posts: 208
Joined: 14 Aug 2008 18:06

Re: searching for text within PDFs /PPTs

Post by kartal »

You can use the free ScanFS search software for that.

http://www.saleensoftware.com/ScanFS.aspx

apisto
Posts: 5
Joined: 28 Dec 2008 15:16

Re: searching for text within PDFs /PPTs

Post by apisto »

First, let me say that I love XYplorer and am happy I purchased the lifetime license; this is definitely a "you only hear from me when I have something to complain about" situation.

XY's inability to search text in "searchable" PDFs is disappointing. I scan many documents to PDF and go to the effort of running them through OCR so that they become searchable PDFs. Because XY lacks this functionality, I often cannot use its wonderful search abilities. Windows Explorer in XP, Vista, and I would assume 7 can search for text in searchable PDFs, so this is a step back for XY in that regard.

I do not expect XY to pattern-match on non-OCR'ed PDFs, but I would really like to see the ability to search text within PDFs that have already been run through OCR.

Thanks for listening. I hope to see this feature in the future.

admin
Site Admin
Posts: 64854
Joined: 22 May 2004 16:48
Location: Win8.1, Win10, Win11, all @100%

Re: searching for text within PDFs /PPTs

Post by admin »

apisto wrote:First, let me say that I love XYplorer and am happy I purchased the lifetime license; this is definitely a "you only hear from me when I have something to complain about" situation.

XY's inability to search text in "searchable" PDFs is disappointing. ...
Yes, I'm aware of it and will eventually add it.
