In General

You can load data into an Explorer project either from a file or through the API. This article describes the file format. For information about the API, see the relevant articles elsewhere in our knowledge base.


The file should be laid out in rows and columns, like a spreadsheet in Excel or a similar program.


The first row of the file should be a header row. The header row should contain meaningful column names, since you will need to refer to the columns from inside Explorer.


Each row should contain all the data for one respondent. The text must be in one column, and further columns can hold things like an ID, metadata, or demographic data for that respondent.

You can use these extra columns to filter your analysis (http://docs.gavagai.io/#232-filter-conditions-by-metadata-optional).


Every row should use the same column layout, so that each respondent's text is always in the same column.
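
As an illustration, here is a minimal sketch of a file in this layout, written with Python's standard csv module. The column names (id, age_group, text), the sample answers, the file name, and the UTF-8 encoding are all assumptions made up for the example. Explorer does not require these particular names.

```python
import csv

# Hypothetical column names; choose names that are meaningful for your own data.
HEADER = ["id", "age_group", "text"]

# One row per respondent, with the text always in the same column.
ROWS = [
    ["1", "18-29", "The checkout was quick and easy."],
    ["2", "30-44", "Delivery took longer than promised."],
    ["3", "30-44", "Great support, but the app is slow."],
    ["4", "45-59", "Prices are fair and the staff is friendly."],
    ["5", "60+", "I could not find the return form."],
]


def write_example(path):
    """Write the header row followed by one row per respondent."""
    # newline="" lets the csv module control line endings itself.
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(HEADER)
        writer.writerows(ROWS)


if __name__ == "__main__":
    write_example("responses.csv")
```

The csv module quotes any field that contains a comma (such as the third answer above), so the text column stays intact when the file is read back.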


The text column you choose to analyze should contain text in only one language.


If you are analyzing survey responses, each column should hold the answers to only one question. One such column corresponds to one project in the Explorer. The reason is that the Explorer finds the common topics and themes in the text, and mixing answers to different questions in one column lowers the quality of the analysis.
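
If a survey export has one column per question, one way to prepare it is to write a separate file for each question. The sketch below does this with Python's standard csv module. The function name split_questions, the output file naming, the choice to rename the answer column to "text", and the choice to skip blank answers are all assumptions for the example, not something Explorer requires.

```python
import csv


def split_questions(src_path, keep_columns, question_columns, encoding="utf-8"):
    """Write one CSV per question column and return the paths written.

    keep_columns are copied into every output file (for example an ID or
    demographic columns); each question's answers go into a "text" column.
    """
    with open(src_path, newline="", encoding=encoding) as f:
        rows = list(csv.DictReader(f))

    written = []
    for question in question_columns:
        out_path = f"{src_path}.{question}.csv"  # hypothetical naming scheme
        with open(out_path, "w", newline="", encoding=encoding) as out:
            writer = csv.writer(out)
            writer.writerow(list(keep_columns) + ["text"])
            for row in rows:
                answer = (row[question] or "").strip()
                if answer:  # skip respondents who left this question blank
                    writer.writerow([row[c] for c in keep_columns] + [answer])
        written.append(out_path)
    return written
```

For example, split_questions("survey.csv", ["id"], ["q1", "q2"]) would write survey.csv.q1.csv and survey.csv.q2.csv, each of which can then be uploaded to its own project.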


File size

Your file should have at least five rows plus the header row. The Explorer needs at least that much data to perform a meaningful analysis.

The file should be no larger than 107 MB and contain no more than 250 thousand rows.
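
These limits can be checked before uploading. The following is a minimal sketch for CSV files using only the Python standard library. It assumes that 1 MB means 1,000,000 bytes, that the 250 thousand row limit counts data rows without the header, and that the file is UTF-8 encoded. None of these details are stated above, so adjust the constants if Explorer behaves differently.

```python
import csv
import os

MIN_DATA_ROWS = 5            # at least five rows plus the header row
MAX_DATA_ROWS = 250_000      # assumption: the header row is not counted
MAX_BYTES = 107 * 1_000_000  # assumption: 1 MB = 1,000,000 bytes


def check_limits(path, encoding="utf-8"):
    """Return a list of problems; an empty list means the file is within limits."""
    problems = []

    size = os.path.getsize(path)
    if size > MAX_BYTES:
        problems.append(f"file is {size} bytes, limit is {MAX_BYTES}")

    with open(path, newline="", encoding=encoding) as f:
        reader = csv.reader(f)
        next(reader, None)  # skip the header row
        # Count parsed rows rather than physical lines, so quoted text that
        # spans several lines still counts as a single row.
        data_rows = sum(1 for row in reader if row)

    if data_rows < MIN_DATA_ROWS:
        problems.append(f"only {data_rows} data rows, need at least {MIN_DATA_ROWS}")
    if data_rows > MAX_DATA_ROWS:
        problems.append(f"{data_rows} data rows, limit is {MAX_DATA_ROWS}")

    return problems
```

Calling check_limits("responses.csv") on a hypothetical file returns an empty list when the file passes all three checks.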


Excel files

The data you want to upload to Explorer must be on the first sheet of your Excel file, and that sheet must contain a header row with informative titles such as "text", "customer reference", etc. If an Excel file contains multiple sheets, the Explorer processes only the first one, so be sure to arrange or split the file accordingly.


CSV files

You can use a tool like http://csvlint.io/ to check whether your file is a proper CSV. To produce a CSV file, you can choose Save As in Excel and select the CSV format. Another option is to upload the file to Google Drive, open it as a Google spreadsheet, and then download it as a CSV (File → Download as → CSV). A third option is to open your original file in LibreOffice and save it as CSV (this has the added benefit of letting you change the encoding if you want). All of these options produce good CSV files that adhere to the standard, which you can read more about here: https://en.wikipedia.org/wiki/Comma-separated_values.


If you export from Excel, note that the file must be comma-separated, not semicolon-separated.
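
If an export has already come out semicolon-separated, it can be converted afterwards. The following is a minimal sketch using Python's standard csv module. The function name semicolon_to_comma and the UTF-8 default are assumptions for the example, so pass a different encoding if your export uses one.

```python
import csv


def semicolon_to_comma(src_path, dst_path, encoding="utf-8"):
    """Rewrite a semicolon-separated file as a comma-separated one."""
    with open(src_path, newline="", encoding=encoding) as src, \
         open(dst_path, "w", newline="", encoding=encoding) as dst:
        reader = csv.reader(src, delimiter=";")
        writer = csv.writer(dst, delimiter=",")
        for row in reader:
            # The writer adds quotes around any field that contains a comma,
            # so commas inside the text do not create extra columns.
            writer.writerow(row)
```

Parsing and re-writing the rows is safer than a plain search and replace of ";" with ",", because respondents' text often contains commas and semicolons of its own.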


For more information on file formats, see section 2.1.1 of the documentation.