searching for text within PDFs /PPTs

Posted: 04 Mar 2008 15:27
by vmaniku
I'm a new user of XYplorer. So far, I've been very impressed with it, but I'm beginning to be disappointed with the search. Hopefully, I'm just not using the features correctly.

Is there a way to search for text within a PDF or PPT document?

I was using the info panel, and going to the "Containing Text" subtab in Find Files. I went to a directory where I know there is a PDF file with a certain matching text. Search doesn't return any results; I even tried the "Match Unicode" option. Is there some other way?

cheers,
V

Posted: 04 Mar 2008 16:05
by j_c_hallgren
Hi and welcome to the XY world!

You are correct that XY is not presently able to locate text within a PDF, mainly because it appears as just a bunch of graphics and not "real" text, which you can see via the Raw View on Info Panel...so any search would have to decrypt/decipher the internal format which a PDF viewer does.

new feature?

Posted: 04 Mar 2008 17:00
by vmaniku
Thanks for the quick reply!

It should be possible b/c one of the other explorer replacements I looked at before XY was able to search within PDFs and many other formats. I can post this to the requested features forum and maybe it could be added in later.

thanks.

Re: new feature?

Posted: 04 Mar 2008 17:44
by j_c_hallgren
vmaniku wrote:Thanks for the quick reply!

It should be possible b/c one of the other explorer replacements I looked at before XY was able to search within PDFs and many other formats. I can post this to the requested features forum and maybe it could be added in later.

thanks.
Yes, one of the things that you'll find here is quick replies (in most cases)...we try to respond to new users promptly because it shows that, while XY may not do everything the competition does, we have great support here! Which can help offset those missing features...

I noticed that xplorer2 has that search feature in the paid/pro version...so it's possible...but as Don wrote, scripting is the current feature that is being concentrated on, as it will help make XY more unique...he's also generally quite receptive to user requests, and it doesn't have to be in the "Wishes" forum to be acted upon...

Posted: 09 Mar 2008 03:33
by lukescammell
PDF contents searching would actually be damn useful, I can't think why I didn't ask for it before...

Posted: 09 Mar 2008 09:11
by fishgod
j_c_hallgren wrote:mainly because it appears as just a bunch of graphics and not "real" text, which you can see via the Raw View on Info Panel...so any search would have to decrypt/decipher the internal format which a PDF viewer does.
But this doesn't apply to all PDF documents. As far as I know, text is also stored in a text/Unicode form; why else would it be possible to include fonts in a PDF document?
Maybe Don can take a look at it and make it work for PDFs with "real" text in them.

Posted: 09 Mar 2008 11:37
by admin
fishgod wrote:
j_c_hallgren wrote:mainly because it appears as just a bunch of graphics and not "real" text, which you can see via the Raw View on Info Panel...so any search would have to decrypt/decipher the internal format which a PDF viewer does.
But this doesn't apply to all PDF documents. As far as I know, text is also stored in a text/Unicode form; why else would it be possible to include fonts in a PDF document?
Maybe Don can take a look at it and make it work for PDFs with "real" text in them.
XY can search the contents of any file, and that includes PDFs. However, it searches the raw, uninterpreted contents. With complex file formats like PDF or Office files, XY is likely to find either too much or too little, since the raw data and the interpreted data differ to a large degree.

To search the interpreted text of such files, one has to use filters (interpreters), that are usually supplied to the OS with the installation of such software. (You guessed it: I won't write my own PDF filter... :wink: )

It's on my list but other things will come first.
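
To illustrate the raw-versus-interpreted difference, here is a minimal Python sketch. It is not how XY or any real PDF filter works; it assumes a toy, single-stream fragment shaped like a PDF content stream compressed with FlateDecode, and the function names are made up for illustration. Real PDFs add many more layers (multiple objects, font encodings, encryption) that a proper filter has to handle.

```python
import zlib


def build_toy_pdf_stream(text: bytes) -> bytes:
    """Build a toy fragment shaped like a Flate-compressed PDF content stream."""
    # BT ... Tj ET is the PDF syntax for showing a text string.
    content = b"BT /F1 12 Tf (" + text + b") Tj ET"
    compressed = zlib.compress(content)
    header = b"<< /Length " + str(len(compressed)).encode() + b" /Filter /FlateDecode >>\n"
    return header + b"stream\n" + compressed + b"\nendstream"


def raw_contains(data: bytes, needle: bytes) -> bool:
    """Search the uninterpreted bytes, as a plain content search would."""
    return needle in data


def interpreted_contains(data: bytes, needle: bytes) -> bool:
    """Decompress the stream first, then search the decoded content."""
    start = data.index(b"stream\n") + len(b"stream\n")
    end = data.rindex(b"\nendstream")
    return needle in zlib.decompress(data[start:end])


if __name__ == "__main__":
    data = build_toy_pdf_stream(b"quarterly invoice total")
    print("raw search finds 'invoice':        ", raw_contains(data, b"invoice"))
    print("interpreted search finds 'invoice':", interpreted_contains(data, b"invoice"))
    print("raw search finds 'FlateDecode':    ", raw_contains(data, b"FlateDecode"))
```

In this toy case the raw search misses the visible word because it only exists in compressed form ("too little"), while it does match `FlateDecode`, a structural keyword that never appears on the page ("too much").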

Posted: 09 Mar 2008 15:29
by serendipity
That explains why I find some terms inside PDFs but not others. It's not a high priority for me right now, but +1 for this feature.

Re: searching for text within PDFs /PPTs

Posted: 17 Jan 2009 23:57
by lem
I'd like to add another 'pretty please' for enhancing XYplorer to be able to search for text within PDFs :)

Re: searching for text within PDFs /PPTs

Posted: 18 Jan 2009 01:28
by graham
Try this script to run the search.

If necessary, change the path to your Adobe Reader in the script.
I set the script up as a Catalog search item.
To use it, select the PDF file you want to search and click the search script in the Catalog; the script will then ask for the search words.

Code:

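// XYplorer PDF Search

// Change the next line to match your Adobe Reader installation
   $adobe = "c:\program files\adobe\reader 9.0\reader\acrord32.exe";

   // The PDF currently selected in the file list
   $pdf  = "<curitem>";
   input $words, "Enter the search word(s), separated by spaces", "word1 word2";
   // Adobe Reader open parameter: /A "search=word1 word2"
   $param = ' /A "search=' . $words . '" ';
   // Quote both paths so that spaces in them survive
   $command = """$adobe""" . $param . """$pdf""";
   open $command;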

Re: searching for text within PDFs /PPTs

Posted: 02 Feb 2009 05:01
by kartal
You can use the free ScanFS search software for that.

http://www.saleensoftware.com/ScanFS.aspx

Re: searching for text within PDFs /PPTs

Posted: 20 Jun 2009 13:34
by apisto
First let me say that I love XYplorer and am happy I purchased the lifetime license, and this is definitely a "you only hear from me when I have something to complain about" situation.

XY's inability to search for text in "searchable" PDFs is disappointing. I scan many documents to PDF and go to the effort of running them through OCR so that they become searchable PDFs. Because XY lacks this functionality, I often cannot use its wonderful search abilities. Windows Explorer in XP, Vista, and I would assume 7 can search for text in searchable PDFs, so this is a step back for XY in that regard.

I do not expect XY to pattern match on non-OCR'ed PDFs, but I would really like to see the ability to search for text within PDFs that have already been run through OCR.

Thanks for listening. I hope to see this feature in the future.

Re: searching for text within PDFs /PPTs

Posted: 20 Jun 2009 13:54
by admin
apisto wrote:First let me say that I love XYplorer and am happy I purchased the lifetime license, and this is definitely a "you only hear from me when I have something to complain about" situation.

XY's inability to search for text in "searchable" PDFs is disappointing. ...
Yes, I'm aware of it and will eventually add it.