Chat with Documents
The Next Step in Information Processing
Instead of relying on public datasets and general knowledge, "Chat with Documents" generates context-specific answers and analyses based on your trusted internal sources. Upload your documents and use them as a basis for answering questions in the chat!
Solving Data Limitations
When you ask a language model a question, you depend on the dataset on which the model was trained. This is generally information retrieved from the internet, so non-public sources are unlikely to be part of it. By using your documents as a source for the chat, you ensure that the model has the information it needs to answer your questions.
Possibilities with Your Documents
You can ask questions about your documents, for example to list the main points of a document or to summarize it. You can also have the language model perform specific analyses on your own dataset.
Drawbacks of Document-Based Chatting
Uploading and processing documents are extra steps that you do not need if you can get good answers without the context of specific information. Generating an answer also takes longer, because the relevant information must be retrieved from the documents before the request can be sent to the language model.
Behind the Scenes of Chatting with Documents
The text from the documents you upload is extracted and divided into chunks. Each chunk has a fixed length of 1024 characters, and consecutive chunks overlap by 128 characters. Each text chunk is stored as a vector in a vector database. For each question, a selection is made from this data based on similarity to the question being asked.
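The chunking step can be sketched as follows. This is a minimal illustration, not the actual implementation: the function name `chunk_text` is hypothetical, and it assumes the overlap is realized by starting each new chunk 128 characters before the previous one ends.

```python
def chunk_text(text: str, chunk_size: int = 1024, overlap: int = 128) -> list[str]:
    """Split text into fixed-size character chunks that overlap.

    Assumption: each chunk starts (chunk_size - overlap) characters after
    the previous one, so neighbouring chunks share `overlap` characters.
    """
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be non-negative and smaller than chunk_size")

    step = chunk_size - overlap  # 896 characters with the default values
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # this chunk already reaches the end of the text
    return chunks
```

With the default values, a text of 2,000 characters yields three chunks: two full chunks of 1024 characters and a shorter final chunk. The last 128 characters of each chunk are repeated at the start of the next one, so a sentence that falls on a chunk boundary still appears in full in at least one chunk.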
Document Fragment Selection Process
Each text chunk has already been converted into a vector. A vector is a list of numbers, one per dimension, and chunks with similar content get vectors that lie close together. Think of the RGB color system: two colors with nearly the same RGB values look nearly the same. The vector database allows us to retrieve text chunks ranked and filtered by their similarity to the question being asked. We select a maximum of 100 text chunks of 1024 characters each to send along with the question.
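The ranking idea can be illustrated with the RGB analogy from the text. The sketch below assumes cosine similarity as the similarity measure, which the text does not specify, and uses tiny three-dimensional vectors in place of real text vectors. The names `cosine_similarity` and `select_chunks` are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a zero vector has no direction to compare
    return dot / (norm_a * norm_b)


def select_chunks(question_vector, stored_chunks, max_chunks=100):
    """Return the texts of the most similar chunks, best match first.

    stored_chunks is a list of (text, vector) pairs. The metric is an
    assumption; a real vector database may rank differently.
    """
    ranked = sorted(
        stored_chunks,
        key=lambda item: cosine_similarity(question_vector, item[1]),
        reverse=True,
    )
    return [text for text, _ in ranked[:max_chunks]]


# Toy data: RGB values stand in for text vectors.
stored = [
    ("blue", (0, 0, 255)),
    ("orange", (255, 165, 0)),
    ("red", (255, 0, 0)),
]
almost_red = (250, 10, 10)
print(select_chunks(almost_red, stored, max_chunks=2))  # ['red', 'orange']
```

The "question" here is a color that is almost red, so red ranks first, orange second, and blue is filtered out by the limit of two results. In the real feature the limit is 100 chunks.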
Suitable Models for Document-Based Chatting
We have selected models with a large context window to enable chatting with documents. We want to be able to send up to 100 text chunks of 1024 characters each, which adds up to 102,400 characters. Models with a smaller context window, such as GPT-3.5, cannot process that much text. Therefore, we recommend using this feature only in combination with GPT-4.1, Gemini 2.5 Pro, or Claude 4.0.
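The size of the largest possible request can be worked out in a few lines. The character count follows directly from the numbers above. The conversion to tokens is an assumption: roughly four characters per token is a common rule of thumb for English text, but the real ratio depends on the model and the language. The function names are hypothetical.

```python
import math

MAX_CHUNKS = 100         # from the text: at most 100 chunks per question
CHUNK_SIZE_CHARS = 1024  # from the text: 1024 characters per chunk
CHARS_PER_TOKEN = 4      # assumption: rough average for English text


def max_context_chars(max_chunks=MAX_CHUNKS, chunk_size=CHUNK_SIZE_CHARS):
    """Upper bound on the characters of document text sent with a question."""
    return max_chunks * chunk_size


def estimate_tokens(num_chars, chars_per_token=CHARS_PER_TOKEN):
    """Very rough token estimate; real tokenizers will give other numbers."""
    return math.ceil(num_chars / chars_per_token)


def fits_context_window(context_window_tokens, num_chars):
    """Check the estimate against a context window given in tokens."""
    return estimate_tokens(num_chars) <= context_window_tokens


print(max_context_chars())                   # 102400
print(estimate_tokens(max_context_chars()))  # 25600
```

Under this assumption the document text alone takes about 25,600 tokens, before the question, the instructions, and the answer are counted. A model needs a context window comfortably above that figure to be usable for this feature.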
Select One or Multiple Documents
You can turn on file mode by clicking the paperclip on the right side of the question bar. You can choose up to 10 files to chat with.
When you start chatting with documents, AI-School checks whether the selected language model is suitable for chatting with documents. If it is not, GPT-4o is selected automatically.
The chat keeps using these documents for as long as file mode is on.
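The model check and the file limit described above could look like this. It is a sketch only: the model names are display labels taken from the text rather than real API identifiers, and treating GPT-4o as a suitable model is an assumption based on its role as the automatic fallback. The names `resolve_model` and `validate_file_count` are hypothetical.

```python
# Assumption: the named models plus the GPT-4o fallback count as suitable.
SUITABLE_MODELS = {"GPT-4.1", "Gemini 2.5 Pro", "Claude 4.0", "GPT-4o"}
FALLBACK_MODEL = "GPT-4o"
MAX_FILES = 10  # from the text: up to 10 files per chat


def resolve_model(selected_model: str, file_mode_on: bool) -> str:
    """Return the model to use, switching to the fallback if needed."""
    if not file_mode_on:
        return selected_model  # no documents, so no restriction applies
    if selected_model in SUITABLE_MODELS:
        return selected_model
    return FALLBACK_MODEL


def validate_file_count(filenames: list[str]) -> None:
    """Raise ValueError if more files are selected than file mode allows."""
    if len(filenames) > MAX_FILES:
        raise ValueError(f"at most {MAX_FILES} files can be selected")
```

For example, `resolve_model("GPT 3.5", file_mode_on=True)` returns `"GPT-4o"`, while the same call with `file_mode_on=False` leaves the selection unchanged.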
Supported File Types
AI-School supports various file types for chatting with documents:
- PDF files ending in .pdf
- Word files ending in .docx
- CSV files ending in .csv
- JSON files ending in .json
- Text files ending in .txt
- Audio and video files ending in .mp3, .mp4, .mpeg, .mpga, .m4a, .wav, or .webm
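A check against the list above can be sketched as follows. The function name `classify_upload` is hypothetical, and matching the extension case-insensitively is an assumption about how uploads are handled.

```python
from pathlib import Path
from typing import Optional

# Extensions taken from the list of supported file types.
DOCUMENT_EXTENSIONS = {".pdf", ".docx", ".csv", ".json", ".txt"}
MEDIA_EXTENSIONS = {".mp3", ".mp4", ".mpeg", ".mpga", ".m4a", ".wav", ".webm"}


def classify_upload(filename: str) -> Optional[str]:
    """Return "document", "media", or None for an unsupported file.

    Assumption: the extension is compared case-insensitively.
    """
    suffix = Path(filename).suffix.lower()
    if suffix in DOCUMENT_EXTENSIONS:
        return "document"
    if suffix in MEDIA_EXTENSIONS:
        return "media"
    return None


print(classify_upload("report.PDF"))   # document
print(classify_upload("lecture.m4a"))  # media
print(classify_upload("slides.pptx"))  # None
```

Separating documents from media files is useful because the two groups take different routes: text is extracted directly from documents, while audio and video files are transcribed first.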
Chatting with Audio or Video Files
To chat with audio or video files, AI-School uses OpenAI's Whisper model to transcribe the spoken content into text.
After transcription, we run the text through GPT-4o to check and correct punctuation and spelling.
The corrected text then goes through the same procedure as text extracted from PDF or Word documents.
Whisper has a limit of 25 MB per audio or video file. We therefore apply the same limit when uploading new files.
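The upload limit can be checked before a file is sent for transcription. In this sketch the function names are hypothetical, and reading 25 MB as 25 × 1024 × 1024 bytes is an assumption, since the text does not say whether decimal or binary megabytes are meant.

```python
import os

# Assumption: 1 MB is counted as 1024 * 1024 bytes.
MAX_MEDIA_BYTES = 25 * 1024 * 1024


def is_within_upload_limit(size_in_bytes: int, limit: int = MAX_MEDIA_BYTES) -> bool:
    """Return True if a file of this size may be uploaded."""
    return size_in_bytes <= limit


def file_is_within_upload_limit(path: str) -> bool:
    """Check the size of a file on disk against the limit."""
    return is_within_upload_limit(os.path.getsize(path))
```

A file of exactly 25 MB passes the check, and a file one byte larger is rejected. Larger recordings would have to be shortened or compressed before upload.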