pdfsearch (0.1.dev11+g03842be.d20240922)

Published 2024-09-22 16:29:36 +00:00 by calipp

Installation

pip install --index-url  pdfsearch

About this package

PDFSearch

PDFSearch is a small utility that acts as a front end to various search engines, mainly for PDF files.

Features

  • Multiple Search Engines (possibly at once)
    Currently implemented:

    • pdfgrep
    • recoll
    • ag / grep (ag: The Silver Searcher with a fallback to grep if ag is not installed)
    • find
  • File Preview
    A preview of the selected search result is shown if the file is identified as a PDF, a text-based file, or an image.

  • Context
    Search results are shown in a list with a little bit of context.

  • Easy Navigation
    Using the arrow keys, the next or previous search result can be selected, and the corresponding file is previewed at the matching page (or position, for text files).

  • File Tagging and Commenting
    Files can be tagged and commented on. Tags and comments apply to entire files; they are not PDF annotations.
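
The "ag with a fallback to grep" behaviour listed above can be sketched as follows. This is only an illustration, not PDFSearch's actual code: the function names and the chosen command-line flags are assumptions made for this example.

```python
import shutil


def pick_text_search_tool(which=shutil.which):
    """Return "ag" if it is found on PATH, otherwise fall back to "grep".

    `which` is injectable so the choice can be tested without ag installed.
    """
    return "ag" if which("ag") else "grep"


def build_command(term, directory, which=shutil.which):
    """Build an argument list for a recursive text search (hypothetical helper)."""
    if pick_text_search_tool(which) == "ag":
        # ag searches directories recursively by default.
        return ["ag", "--nocolor", term, directory]
    # grep: -r recursive, -n line numbers, -I skip binary files,
    # -e marks the next argument as the pattern.
    return ["grep", "-rnI", "-e", term, directory]
```

Building the command as a list (rather than a shell string) avoids quoting problems when the search term contains spaces or special characters.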

Installation

git clone <this repository> pdfsearch
cd pdfsearch
pip install .

Usage

Opening:
After installing PDFSearch, the pdfsearch command is available on the command line. (If not, restart your shell and/or add ~/.local/bin to your PATH variable.)

Overview:
When PDFSearch starts, an empty preview and an empty result list are shown, with a search input at the top left.

Searching:
Enter the desired search term and press Enter or click the search button.

Search Location:
The default search location is ~/Documents, but this can be changed either temporarily or permanently.

There are two ways to change it temporarily: use the File->Change Directory menu, or use View->Toggle Directory Chooser (or drag the left edge towards the center of the window) to reveal a file tree and select the directory to search. The directory that is searched is the highlighted one, not the 'deepest' one in the tree.

To permanently change the initial search directory, see Configuration.

Search Engines:
It is possible to search with one or multiple search engines (search programs). To select the desired engines, use the toolbar at the top of the window. Select or deselect the ones you want.

Tagging:
The tagging feature can be used when the tag panel is shown (View->Toggle Tag List or F4 if it is not shown by default). The panel contains two lists. The top list has a toggle with two modes: "All Tags" (toggle on the left) lists all tags known to the program, and "Filter Files" (toggle on the right) lets you filter the current result list (either the search results or the files for a tag). Clicking on a tag in "All Tags" mode shows all the files for this tag. Clicking on a tag in "Filter Files" mode filters the current list by that tag.

The lower list shows the tags associated with the currently selected file.

Tags can be renamed (double-click on a tag name) and removed (rename the tag to an empty name).

To create a new tag, use the input field below the lists and press enter.

Comments:
To view/edit/create comments for a file, select the file in the left result list and enter the comments in the field at the bottom of the tagging panel.

Configuration

PDFSearch can be customised through a configuration file, located at ~/.config/pdfsearch.cfg

Most options should be self-explanatory from their names; the most important ones are listed here:

[Search]
timeout: The timeout for search engines.
         When it expires, the engine kills its search program (e.g. pdfgrep)
         and the results found up to that point are shown.
engines: A comma-separated list of engines that are selected at startup.
path: The initial search path when opening PDFSearch.

[Results]
limit_num: Set this to an integer to limit the number of results to show in the list. (default: 100)
           (This does not affect the search engines, which will search the entire directory regardless)
show_results_status: Set this to on or off to show or hide the LEDs above the result list.
                     They indicate whether the list holds search results or the files for a tag,
                     and whether the list is filtered by a tag.
revert_to_search_results: Set this to True or on if the result list should revert to search results when
                          the "All Tags / Filter Files" toggle is moved from All to Filter.
                          If this is off the list can filter the files for the previously selected tag.

[Optics]
dirtree_shown: Show the directory tree at startup
tags_shown: Show the tag panel at startup
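
As an illustration of how these options fit together, the sketch below parses a sample file with Python's configparser. It assumes the file uses the INI-style "key: value" syntax suggested by the listing above. Apart from ~/Documents and the limit of 100, which are the defaults mentioned in this document, all values are hypothetical, and load_settings is a helper invented for this example, not part of PDFSearch.

```python
import configparser

# Hypothetical sample; only path and limit_num reflect defaults named in this README.
SAMPLE_CONFIG = """
[Search]
timeout: 30
engines: pdfgrep, recoll
path: ~/Documents

[Results]
limit_num: 100
show_results_status: on
revert_to_search_results: off

[Optics]
dirtree_shown: off
tags_shown: on
"""


def load_settings(text):
    """Parse config text into a plain dict with typed values."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    engines = parser.get("Search", "engines").split(",")
    return {
        "timeout": parser.getfloat("Search", "timeout"),
        "engines": [name.strip() for name in engines if name.strip()],
        "path": parser.get("Search", "path"),
        "limit_num": parser.getint("Results", "limit_num"),
        # getboolean accepts on/off, true/false, yes/no and 1/0.
        "show_results_status": parser.getboolean("Results", "show_results_status"),
        "revert_to_search_results": parser.getboolean("Results", "revert_to_search_results"),
        "dirtree_shown": parser.getboolean("Optics", "dirtree_shown"),
        "tags_shown": parser.getboolean("Optics", "tags_shown"),
    }


settings = load_settings(SAMPLE_CONFIG)
```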

To get the default configuration file use

pdfsearch --write-default-config

and check ~/.config/pdfsearch.cfg. Note: This will fail if your config file already exists and is not empty. You can create a fresh config file with default values in another location with

pdfsearch -c <path-to-configfile> --write-default-config

which will create the config file specified in <path-to-configfile>.
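
The timeout option in the [Search] section describes killing a search program once the time limit is reached while keeping the results found so far. A minimal sketch of that pattern with Python's subprocess module is shown below. It is an assumption about how such behaviour could be implemented, not a description of PDFSearch's internals, and run_with_timeout is a hypothetical name.

```python
import subprocess


def run_with_timeout(command, timeout_seconds):
    """Run a search command; on timeout, kill it and keep the partial output.

    Returns (lines, timed_out), where lines holds whatever the program
    printed to stdout before it finished or was killed.
    """
    proc = subprocess.Popen(
        command,
        stdout=subprocess.PIPE,
        stderr=subprocess.DEVNULL,
        text=True,
    )
    try:
        output, _ = proc.communicate(timeout=timeout_seconds)
        timed_out = False
    except subprocess.TimeoutExpired:
        # Stop the program, then collect what it produced up to this point.
        proc.kill()
        output, _ = proc.communicate()
        timed_out = True
    return output.splitlines(), timed_out
```

For example, run_with_timeout(["pdfgrep", "-rn", "term", "."], 30) would return the matches pdfgrep printed within 30 seconds, assuming pdfgrep is installed.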

License

pdfsearch is distributed under the terms of the MIT license.

Requirements

Requires Python: >=3.7