Integrated Document Search Program (Multi-format Context-Based Document Search Engine)
Search across multiple document types in one unified system
FREE DOWNLOAD https://github.com/applProvide/officeFinder/blob/main/INSTALL.md
• Full support for Excel, Word, PDF, PPT, and HWP
• Precise data extraction with group-based conditional search
• High accuracy through keyword + context analysis
• PDF search tool
• Document search software
• Excel search program
• Image search tool (PNG, JPG, JPEG)
👉 A next-generation document search solution that boosts productivity
- Goes beyond simple keyword matching by analyzing relationships and context between words
- Search scattered documents in one place
- Advanced search that analyzes the context before and after specific words
- Extracts text from various document formats and integrates it into a single searchable system
1. Overview
This program is a condition-based, integrated document search system designed to quickly extract the information you need from many kinds of document files.
Beyond simple keyword search, it improves accuracy by analyzing the **context (surrounding words)** of specific terms.
2. Key Features
2.1 Group-Based Search System
- Search conditions can be organized into groups
- Each group has its own independent conditions, and the groups are combined when the overall search runs
- Complex conditions can be structured logically
- Supports AND / OR logical combination
- Highlights search results
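The group model described above can be illustrated with a minimal Python sketch. This is not the program's actual implementation: `ConditionGroup`, its `mode` field, and the `search` function are hypothetical names chosen for illustration, and plain case-insensitive substring matching is assumed.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ConditionGroup:
    """Hypothetical search group: keywords plus how they are combined."""
    keywords: List[str]
    mode: str = "AND"  # "AND": every keyword must appear; "OR": any one is enough

    def matches(self, text: str) -> bool:
        lowered = text.lower()
        hits = [kw.lower() in lowered for kw in self.keywords]
        return all(hits) if self.mode == "AND" else any(hits)


def search(text: str, groups: List[ConditionGroup], combine: str = "AND") -> bool:
    """Evaluate each group independently, then combine the group results."""
    results = [group.matches(text) for group in groups]
    return all(results) if combine == "AND" else any(results)


groups = [
    ConditionGroup(["contract", "agreement"], mode="OR"),
    ConditionGroup(["amount", "payment"], mode="AND"),
]
doc = "This agreement sets the payment schedule and the total amount due."
print(search(doc, groups))  # True
```

Keeping each group self-contained means a group can be edited or removed without touching the others, which is one plausible reading of "independent conditions".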
2.2 Keyword + Context-Based Search
- Improves accuracy by analyzing the context before and after keywords
- Basic keyword search – checks for the presence of specific terms
- Context expansion search – searches surrounding words based on a keyword
- Example:
  • Search for the word “contract”
  • “contract” + preceding word: “electronic”
  • “contract” + following word: “execution”
  → Accurately detects context such as “electronic contract execution”
- Reduces false positives
- Provides results closer to actual document content
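As a rough sketch of the context expansion idea, the following Python function checks the word immediately before and after a keyword. The function name `context_search` and its parameters are assumptions made for this example, and the tokenization (a simple `\w+` regex) is deliberately naive; the actual program may analyze context differently.

```python
import re
from typing import List, Optional


def context_search(text: str, keyword: str,
                   before: Optional[str] = None,
                   after: Optional[str] = None) -> List[int]:
    """Return word indexes where `keyword` occurs with the required neighbours."""
    words = re.findall(r"\w+", text.lower())
    target = keyword.lower()
    hits = []
    for i, word in enumerate(words):
        if word != target:
            continue
        # Reject the hit if a required preceding word is missing or different.
        if before is not None and (i == 0 or words[i - 1] != before.lower()):
            continue
        # Reject the hit if a required following word is missing or different.
        if after is not None and (i + 1 >= len(words) or words[i + 1] != after.lower()):
            continue
        hits.append(i)
    return hits


text = "The electronic contract execution was delayed. A paper contract was signed."
print(context_search(text, "contract"))                                           # [2, 8]
print(context_search(text, "contract", before="electronic", after="execution"))  # [2]
```

The plain keyword search finds both occurrences of "contract", while the context conditions keep only the one inside "electronic contract execution".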
2.3 Support for Multiple Document Formats
- Excel : .xls, .xlsx
- Word : .doc, .docx
- PDF : .pdf
- PowerPoint : .ppt, .pptx
- HWP (Hangul) : .hwp, .hwpx
- Text : .txt, .xml, .js, .py, .csv, .log, .html, .htm, .css, .java, .cpp, .c, .json, .yaml, .yml, .bat, .sh
- Processes multiple formats within a single search system
- Provides a consistent search experience regardless of file type
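One simple way to route files to format-specific handling is a lookup on the file extension. The sketch below uses the extensions listed above; the family labels (`"excel"`, `"word"`, and so on) and the function `detect_format` are hypothetical, and no actual parsing is shown.

```python
from pathlib import Path
from typing import Optional

# Extension-to-family table built from the supported formats listed above.
FORMAT_BY_EXTENSION = {
    ".xls": "excel", ".xlsx": "excel",
    ".doc": "word", ".docx": "word",
    ".pdf": "pdf",
    ".ppt": "powerpoint", ".pptx": "powerpoint",
    ".hwp": "hwp", ".hwpx": "hwp",
}
TEXT_EXTENSIONS = {
    ".txt", ".xml", ".js", ".py", ".csv", ".log", ".html", ".htm", ".css",
    ".java", ".cpp", ".c", ".json", ".yaml", ".yml", ".bat", ".sh",
}


def detect_format(path: str) -> Optional[str]:
    """Return the format family for a path, or None if it is unsupported."""
    suffix = Path(path).suffix.lower()
    if suffix in TEXT_EXTENSIONS:
        return "text"
    return FORMAT_BY_EXTENSION.get(suffix)


print(detect_format("Q3_Report.XLSX"))  # excel
print(detect_format("notes.txt"))       # text
print(detect_format("photo.bmp"))       # None
```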
2.4 Multi-Condition Combined Search
- Performs complex searches by applying multiple groups simultaneously
- Example: (contract-related keywords) AND (includes an amount) AND (specific context)
- Extracts only the required information accurately
- Fast filtering even in large volumes of documents
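The three-part example above can be sketched as a list of predicates applied to a set of documents. Everything here is illustrative: the helper names (`has_any`, `has_amount`, `has_phrase`, `filter_documents`) are hypothetical, and the amount pattern (a currency symbol followed by digits) is an assumption, since the source does not say how amounts are detected.

```python
import re
from typing import Callable, Dict, List

Condition = Callable[[str], bool]


def has_any(*keywords: str) -> Condition:
    """Condition: at least one keyword appears (case-insensitive)."""
    return lambda text: any(k.lower() in text.lower() for k in keywords)


def has_amount(text: str) -> bool:
    """Condition: text contains an amount such as "$12,500" (assumed pattern)."""
    return re.search(r"[$€₩]\s?\d[\d,]*(?:\.\d+)?", text) is not None


def has_phrase(phrase: str) -> Condition:
    """Condition: the exact phrase appears (case-insensitive)."""
    return lambda text: phrase.lower() in text.lower()


def filter_documents(docs: Dict[str, str], conditions: List[Condition]) -> List[str]:
    """Keep only the documents that satisfy every condition (AND)."""
    return [name for name, text in docs.items() if all(c(text) for c in conditions)]


docs = {
    "a.txt": "Electronic contract execution completed for $12,500.",
    "b.txt": "The contract draft is pending review.",
    "c.txt": "Invoice total: $300.",
}
conditions = [has_any("contract", "agreement"), has_amount, has_phrase("electronic contract")]
print(filter_documents(docs, conditions))  # ['a.txt']
```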
2.5 User Convenience Features
- Intuitive UI-based condition configuration
- Retains conditions for repeated searches
- Fast search performance
- Locates matching documents and the position of each match within them
3. Core Technologies
✔ Context-Based Search Algorithm
- Sentence structure-based search rather than keyword-only matching
- Considers relationships between words
✔ Multi-format Parsing Processing
- Applies format-specific extraction logic to each document type
- Converts the extracted content into unified text data for integrated searching
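To illustrate why a unified text representation helps, the sketch below assumes each file has already been extracted into a common record and then reports where a keyword occurs. The `ExtractedDocument` record and `find_positions` function are hypothetical; the extraction step itself is not shown, and the program's real data model may differ.

```python
from dataclasses import dataclass
from typing import Iterator, List, Tuple


@dataclass
class ExtractedDocument:
    """Hypothetical unified record produced by any format-specific extractor."""
    path: str
    format: str
    text: str


def find_positions(docs: List[ExtractedDocument],
                   keyword: str) -> Iterator[Tuple[str, int, int]]:
    """Yield (path, 1-based line number, 0-based column) for each occurrence."""
    needle = keyword.lower()
    for doc in docs:
        for line_no, line in enumerate(doc.text.splitlines(), start=1):
            lowered = line.lower()
            col = lowered.find(needle)
            while col != -1:
                yield doc.path, line_no, col
                col = lowered.find(needle, col + 1)


docs = [
    ExtractedDocument("report.pdf", "pdf", "Summary\nThe contract was renewed."),
    ExtractedDocument("sheet.xlsx", "excel", "contract id,amount\n42,1000"),
]
print(list(find_positions(docs, "contract")))
# [('report.pdf', 2, 4), ('sheet.xlsx', 1, 0)]
```

Because the search code only sees `text`, the same function works regardless of the original file type.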
4. Use Cases
- 📑 Contract / legal document search
- 📊 Extracting specific content from reports
- 🏢 Integrated search across internal documents
- 🔎 Keyword-based auditing and verification tasks