In General

This solution is about the format of your analysis file. For information about API-related file formats (JSON), please have a look at the Gavagai Explorer API documentation.


When you create a project in Gavagai Explorer, you will be prompted to upload a file in XLSX or CSV format. The file should be organized in rows and columns, like a spreadsheet in Excel or a similar program.


The first row of the file should be a header row. The header row should contain meaningful names of the columns since you will need to reference the columns from inside Explorer.


Each row should contain all the data for one respondent. This must include the text in one column, but it can also include further columns with things like an ID, metadata, and demographic data for this respondent.

You can use these additional columns to filter your data.


All respondents' data should be in the same column format so that each respondent's text is in the same column for each row.
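
If you would like to check this before uploading, the following is a minimal sketch for CSV files, not part of Explorer itself. The function name `check_rows` and the column name `text` are made up for illustration, and treating an empty text cell as a problem is an assumption rather than a documented Explorer rule.

```python
import csv
import io


def check_rows(csv_text, text_column):
    """Return a list of problems found in CSV content given as a string."""
    reader = csv.reader(io.StringIO(csv_text, newline=""))
    header = next(reader, None)
    if header is None:
        return ["file is empty"]
    if text_column not in header:
        return [f"no column named {text_column!r} in the header row"]
    text_index = header.index(text_column)

    problems = []
    # Row 1 is the header, so the data rows are counted from 2.
    for row_no, row in enumerate(reader, start=2):
        if len(row) != len(header):
            problems.append(
                f"row {row_no}: expected {len(header)} columns, found {len(row)}"
            )
        elif not row[text_index].strip():
            # Assumption: a respondent without text is worth flagging.
            problems.append(f"row {row_no}: empty text")
    return problems
```

An empty list means that every row has the same number of columns as the header row and a non-empty value in the chosen text column.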


The text column you will choose to analyze should ideally be in only one language.


If you are going to analyze survey responses, a column should contain the answers to only one question. This will correspond to one project in the Explorer. The reason is that the Explorer finds the common topics and themes in the text, and mixing answers to different questions in one column decreases the quality of the analysis.


File size

Your file should have at least five rows plus the header row. The Explorer needs at least that much data to be able to perform a meaningful analysis.

The file should not be bigger than 107 MB or contain more than 250 thousand rows.
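
The limits in this section can be checked for a CSV file with a short script. This is only a sketch under a few assumptions: the name `check_limits` is made up, 1 MB is taken to be 1024 × 1024 bytes, the row limit is taken to apply to the rows below the header, and the file is assumed to be UTF-8 encoded.

```python
import csv
import io

MIN_ROWS = 5                   # at least five rows plus the header row
MAX_ROWS = 250_000             # assumption: counted without the header row
MAX_BYTES = 107 * 1024 * 1024  # assumption: 1 MB = 1024 * 1024 bytes


def check_limits(data, encoding="utf-8"):
    """Return a list of size problems for CSV content given as bytes."""
    problems = []
    if len(data) > MAX_BYTES:
        problems.append(f"file is {len(data)} bytes; the limit is {MAX_BYTES}")

    reader = csv.reader(io.StringIO(data.decode(encoding), newline=""))
    next(reader, None)  # skip the header row
    rows = sum(1 for row in reader if row)  # blank lines are not counted

    if rows < MIN_ROWS:
        problems.append(f"only {rows} data rows; at least {MIN_ROWS} are needed")
    if rows > MAX_ROWS:
        problems.append(f"{rows} data rows; the limit is {MAX_ROWS}")
    return problems
```

You could call it with the bytes of your own file, for example `check_limits(open("responses.csv", "rb").read())`, where `responses.csv` is a placeholder name.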


Excel files

It is important that the data you would like to upload to Explorer is on the first sheet in your Excel file and that it contains a header row, with informative titles such as "text", "customer reference" etc. For Excel files containing multiple sheets, only the first sheet is processed by the Explorer, so be sure to arrange or split the file accordingly.


CSV files

You can use a tool like http://csvlint.io/ to check if your file is a proper CSV. There are several ways to create a CSV file. You can use the "Save as" functionality in Excel and choose the CSV format. Another option is to upload the file to Google Drive, open it as a Google spreadsheet, and then download it as a CSV (File → Download as → CSV). A third option is to open your original file in LibreOffice and save it as CSV (this has the added benefit of enabling you to change the encoding if you want). All of these options give you good CSV files that adhere to the standard, which you can read more about here: https://en.wikipedia.org/wiki/Comma-separated_values.


If you export from Excel, make sure that the file is comma-separated and not semicolon-separated.
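
If you already have a semicolon-separated export, one way to convert it is with Python's standard `csv` module. This is a minimal sketch, and the function name `semicolon_to_comma` is made up for illustration. Going through a CSV reader and writer, rather than replacing every semicolon with a comma, keeps texts that themselves contain commas intact by quoting them.

```python
import csv
import io


def semicolon_to_comma(text):
    """Convert semicolon-separated CSV content to comma-separated content."""
    source = csv.reader(io.StringIO(text, newline=""), delimiter=";")
    out = io.StringIO(newline="")
    writer = csv.writer(out, lineterminator="\n")
    # The writer adds quotes around any field that contains a comma.
    writer.writerows(source)
    return out.getvalue()
```

For example, the row `1;"Good, but slow"` becomes `1,"Good, but slow"`, so the comma inside the text is not mistaken for a column separator.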


For more information on file formats, see the "Preparing an upload file" section of the documentation.