pdfsearch (0.1.dev9+g9be130d)

Published 2024-09-05 17:00:34 +00:00 by calipp

Installation

pip install --index-url  pdfsearch

About this package

PDFSearch

PDFSearch is a small utility that acts mainly as a front end to various search engines, primarily for PDF files.

Features

The default search engines implemented are

  • pdfgrep
  • recoll
  • ag / grep (ag: The Silver Searcher with a fallback to grep if ag is not installed)
  • find
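
The ag-to-grep fallback mentioned in the list can be sketched as follows. This is an illustration only, not PDFSearch's actual code: the function name pick_text_searcher is hypothetical, and it assumes availability is decided by looking for the ag executable on PATH.

```python
import shutil


def pick_text_searcher(which=shutil.which):
    """Return "ag" if The Silver Searcher is on PATH, otherwise "grep".

    `which` is injectable so the choice can be exercised without
    depending on what is installed on the current machine.
    """
    return "ag" if which("ag") else "grep"


if __name__ == "__main__":
    # Prints whichever tool would be used on this machine.
    print(pick_text_searcher())
```

Passing the lookup function as a parameter keeps the decision logic separate from the environment, which makes the fallback easy to check in isolation.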

A preview of the selected search result is shown if the file is identified as a PDF, a text-based file, or an image.

Search results are shown in a list with a little context. Use the arrow keys to select the next or previous result; the corresponding file is then previewed at the page where the match was found (or at the matching position for text files).

It also supports tagging files and adding comments to them (comments apply to entire files; they are not PDF annotations).

Installation

git clone <this repository> pdfsearch
cd pdfsearch
pip install .

Usage

After installing PDFSearch, the pdfsearch command is available on the command line. (If not, restart your shell and/or add ~/.local/bin to your PATH variable.)

When PDFSearch starts, an empty preview and an empty result list are shown, with a search input at the top left.

Enter the desired search term and press Enter or click the search button. The default search location is ~/Documents, but this can be changed either permanently or temporarily.

There are two ways to change it temporarily: use the File->Change Directory menu, or use View->Toggle Directory Chooser (or drag the left edge towards the center of the window) to reveal a file tree and select the directory to search. (The directory that is searched is the highlighted one, not the 'deepest' one.)

To permanently change the initial search directory, see Configuration.

You can search with one or several search engines at once. Use the toolbar at the top of the window to select or deselect engines.

Configuration

There are some configuration options to customise PDFSearch. The configuration file is located at ~/.config/pdfsearch.cfg.

Most options should be self-explanatory from their names, but the most important ones are listed here:

[Search]
timeout: The timeout for search engines.
         When the timeout expires, the engine kills the underlying program (e.g. pdfgrep)
         and the results found up to that point are shown.
engines: A comma-separated list of engines that are selected at startup.
path: The initial search path when opening PDFSearch.

[Results]
limit_num: Set this to an integer to limit the number of results to show in the list. (default: 100)
           (This does not affect the search engines, which will search the entire directory regardless)

[Optics]
dirtree_shown: Show the directory tree at startup
tags_shown: Show the tag page at startup
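
To make the option list concrete, the sketch below parses a hypothetical pdfsearch.cfg with Python's standard configparser module. The section and option names follow the list above, but every value except the limit_num default of 100 is an assumption made for illustration (the timeout value and its unit, the engine selection, the boolean spellings), and this is not necessarily how PDFSearch itself reads the file. Run pdfsearch --write-default-config to see the real defaults.

```python
import configparser

# Hypothetical config contents; the values are illustrative assumptions,
# not the defaults shipped with PDFSearch.
SAMPLE_CONFIG = """
[Search]
timeout = 30
engines = pdfgrep, recoll
path = ~/Documents

[Results]
limit_num = 100

[Optics]
dirtree_shown = false
tags_shown = false
"""

parser = configparser.ConfigParser()
parser.read_string(SAMPLE_CONFIG)

# "engines" is a comma-separated list, so split it and strip whitespace.
engines = [e.strip() for e in parser.get("Search", "engines").split(",") if e.strip()]
timeout = parser.getint("Search", "timeout")
limit_num = parser.getint("Results", "limit_num")
dirtree_shown = parser.getboolean("Optics", "dirtree_shown")

# limit_num only trims the displayed list; the engines still search everything.
all_results = [f"result {i}" for i in range(250)]
shown = all_results[:limit_num]

print(engines, timeout, len(shown), dirtree_shown)
```

The last lines illustrate the note on limit_num: the full result set is still produced, and only the portion shown in the list is truncated.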

To get the default configuration file use

pdfsearch --write-default-config

and check ~/.config/pdfsearch.cfg. Note: This will fail if your config file already exists and is not empty. You can create a fresh config file with default values in another location with

pdfsearch -c <path-to-configfile> --write-default-config

which will create the config file at <path-to-configfile>.

License

pdfsearch is distributed under the terms of the MIT license.

Requirements

Requires Python: >=3.7