Show HN: Index and search *all* your documents

Hey HN! I've built a simple tool to index and search your documents. It uses two great open source libraries: Apache Tika (for extracting content from documents) and Apache Lucene (for searching). It's built in Kotlin, with Ktor as the web framework.

You can index all kinds of files (e.g. doc, docx, xls, ppt, pdf, txt, html, even OCR PDFs) and then search them using advanced queries such as "always contain X", "never contain X", "X near Y", wildcard search, proper stemming support, etc.

We're using it at my work, where we have hundreds of thousands of doc/docx/pdf files, and it works flawlessly!

https://ift.tt/pM2KqRZ

August 11, 2024 at 12:14AM
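To make the query types mentioned in the post concrete, here is a minimal sketch of what "always contain X", "never contain X", "X near Y", and wildcard matching mean when applied to a handful of strings. This is not the tool's code and does not use Lucene: the `matches` and `search` helpers, the sample documents, and the token-distance definition of "near" are all assumptions made for illustration. It also does no stemming, so "invoicing" does not match the term "invoice".

```python
import fnmatch
import re


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def matches(text, must=(), must_not=(), near=None, wildcard=None):
    """Return True if the text satisfies every constraint that was given.

    must      -- terms that must all appear ("always contain X")
    must_not  -- terms that must not appear ("never contain X")
    near      -- (x, y, max_distance): x and y at most max_distance tokens apart
    wildcard  -- shell-style pattern that at least one token must match
    """
    tokens = tokenize(text)
    present = set(tokens)
    if any(term not in present for term in must):
        return False
    if any(term in present for term in must_not):
        return False
    if near is not None:
        x, y, max_distance = near
        xs = [i for i, t in enumerate(tokens) if t == x]
        ys = [i for i, t in enumerate(tokens) if t == y]
        if not any(abs(i - j) <= max_distance for i in xs for j in ys):
            return False
    if wildcard is not None:
        if not any(fnmatch.fnmatchcase(t, wildcard) for t in tokens):
            return False
    return True


# Hypothetical sample documents, standing in for extracted file contents.
docs = {
    "a.txt": "The quarterly invoice was sent to the customer",
    "b.txt": "Draft invoice, never sent",
    "c.txt": "Customer feedback on the new invoicing system",
}


def search(**constraints):
    """Return the sorted names of all sample documents matching the constraints."""
    return sorted(name for name, text in docs.items() if matches(text, **constraints))
```

For example, `search(must=["invoice"], must_not=["draft"])` keeps only documents that contain "invoice" and do not contain "draft", while `search(wildcard="invoic*")` also picks up "invoicing". A real search engine answers such queries from an inverted index rather than by scanning every document, which is what makes hundreds of thousands of files practical.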