App or technology to search unstructured JSON files?

I have a requirement for a web based interface/app to search tens of thousands of JSON files.
The files contain heavily nested data, and are varied and unpredictable in format/structure.

I've been able to create a simple search app using the amazing DataTables plugin (datatables.net) and "client side processing". This works to a point, but with the sheer number of files, this isn't going to work without some sort of database to handle the search processing.

So, whilst I'd normally use MySQL (which works nicely with DataTables), this case probably requires something better equipped to deal with storing unstructured data (specifically JSON), so naturally I've looked towards NoSQL solutions such as MongoDB.

"Out of the box", DataTables doesn't support MongoDB, but should be able to work fine with it, as it only requires valid JSON as a source, and a database server that can handle returning this if you're using it with server side processing.
So there are methods like this:

https://github.com/leblanchannah/mongodb-datatables-serverside

And tutorials like this:

https://carlofontanos.com/mongodb-ajax-pagination-with-search-and-sort-using-php/

(I'm most comfortable with PHP, but I'm not against using something else like node.js/python etc. if that's more suitable for the job).
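For reference, the "valid JSON" that DataTables expects back in server-side mode is a small envelope with the fields draw, recordsTotal, recordsFiltered and data. Below is a minimal Python sketch of that envelope, assuming the records are already loaded into a list of dicts. The function name datatables_response and the naive "search the whole serialised record" filter are illustrative assumptions, not part of DataTables or of either linked project. A real backend would push the filtering down into the database.

```python
import json

def datatables_response(records, draw, start, length, search=""):
    """Build a DataTables server-side reply from an in-memory list of dicts.

    draw, start, length and search correspond to the request parameters
    DataTables sends (search arrives as search[value]).
    """
    needle = search.lower()
    if needle:
        # Naive assumption: match the term anywhere in the serialised record,
        # which also matches key names. Good enough for a sketch only.
        filtered = [r for r in records if needle in json.dumps(r).lower()]
    else:
        filtered = list(records)
    # DataTables sends length = -1 when the user asks for "all" rows.
    page = filtered[start:] if length < 0 else filtered[start:start + length]
    return {
        "draw": int(draw),                 # echoed back so replies can be matched to requests
        "recordsTotal": len(records),      # count before filtering
        "recordsFiltered": len(filtered),  # count after filtering
        "data": page,                      # only the rows for the current page
    }

records = [
    {"fruit": "Apple", "size": "Large", "color": "Red"},
    {"fruit": "Pear", "size": "Small", "color": "Green"},
]
reply = datatables_response(records, draw=1, start=0, length=10, search="apple")
print(json.dumps(reply, indent=2))
```

Whatever sits behind this (MySQL, MongoDB, flat files) is invisible to DataTables as long as the reply has this shape.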

So I have two questions here (for the bounty):

1) Is it worth me "reinventing the wheel" for this, and putting the effort into writing this custom app that allows advanced searching of JSON files, and is MongoDB and DataTables really the best option?

I'd be surprised if there's nothing already out there that deals with this task - An open source web based solution that allows customised searching of unstructured JSON files, with an easy to use, configurable search interface?
Am I missing something obvious that already exists to do this, without me putting the time into writing something heavily customised?

2) How would DataTables deal with fields in the unstructured data that it doesn't even know about?
For example, if I need to be able to insert:

{
    "fruit" : "Apple",
    "size" : "Large",
    "color" : "Red"
}

And say: Search for "Apple" within "fruit", but then perhaps go on to insert:

{
    "size" : "Large",
    "color" : "Brown",
    "petinfo" : {
        "Animal" : "Dog",
        "Characteristics" : [
            "Fluffy",
            "Crossbreed"
        ],
        "Food" : "Dog Biscuits"
    }
}

And be able to search for "dog" within "animal".
Whatever system is used basically needs to be adaptable to searching on unpredictable data and structures.
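To make the requirement concrete, here is a rough Python sketch of one way to search documents whose structure is not known in advance: walk each document, flatten it into (dotted path, value) pairs, and match on the last key of the path, ignoring case. The helper names flatten and matches are hypothetical, and this in-memory approach is only meant to illustrate the idea. It would not scale to tens of thousands of files without an index or a database behind it.

```python
def flatten(value, path=""):
    """Yield (dotted_path, leaf_value) pairs for any nested JSON value."""
    if isinstance(value, dict):
        for key, child in value.items():
            yield from flatten(child, f"{path}.{key}" if path else key)
    elif isinstance(value, list):
        # List items share the path of the key that holds the list.
        for child in value:
            yield from flatten(child, path)
    else:
        yield path, value

def matches(doc, field, term):
    """True if any key named `field` (at any depth) holds a value containing `term`."""
    field, term = field.lower(), term.lower()
    for path, leaf in flatten(doc):
        last_key = path.rsplit(".", 1)[-1].lower()
        if last_key == field and term in str(leaf).lower():
            return True
    return False

docs = [
    {"fruit": "Apple", "size": "Large", "color": "Red"},
    {
        "size": "Large",
        "color": "Brown",
        "petinfo": {
            "Animal": "Dog",
            "Characteristics": ["Fluffy", "Crossbreed"],
            "Food": "Dog Biscuits",
        },
    },
]

# Search for "dog" within "animal", regardless of nesting or letter case.
hits = [d for d in docs if matches(d, "animal", "dog")]
print(len(hits))
```

The flattened paths (for example petinfo.Animal) could also be used to build the list of searchable fields for the interface, since they are discovered from the data rather than declared up front.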

Is this even possible? Or should I be looking at something other than DataTables to provide that level of flexibility?
I just like DataTables because of the instant "as you type" search results, but I'm open to other suggestions.

Sorry if this all sounds a bit vague, but essentially to summarise:

I just want to know if there's already an existing open source (or even paid for) system that allows easy, detailed searching of unstructured, unpredictable JSON data, or should I write my own?
And if I do roll my own, what could be the best technologies to use for doing this?

14 days ago

1 Solution


Well, I believe that DataTables (the front-end search) shouldn't dictate the back-end search mechanism.

In your case, MongoDB is the best choice as a database. For the server side, Spring Boot (a Java web framework), Node.js and Python are all good options. From my experience, I recommend either Spring Boot (my first choice) or Node.js. In Spring Boot you can handle sorting, filtering and paging in a single method. Here's an example that receives the paging + sorting settings as a controller method parameter:

@GetMapping("/api/my-filter-method")
public ResponseEntity<Page<MyCustomObject>> filterMyCustomObjects(Pageable pageable) {
..............
}

This built-in sorting + paging support is really powerful! With no headaches, you get server-side sorting and paging with zero hard work by calling this URL:

http://localhost/api/my-filter-method?&sort=color,desc&sort=fruit,asc&size=10&page=3&.......
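For readers who don't use Spring, the following Python sketch imitates what those query parameters ask for: repeated sort=field,direction entries applied in order of priority, followed by a zero-based page of the given size. It is an illustration of the request semantics only, not Spring's implementation. The function name page_of and the sample rows are made up, and the default page size of 20 is an assumption.

```python
from urllib.parse import urlsplit, parse_qs

def page_of(rows, url):
    """Apply sort/size/page query parameters from `url` to a list of dicts."""
    query = parse_qs(urlsplit(url).query)
    size = int(query.get("size", ["20"])[0])   # assumed default page size
    page = int(query.get("page", ["0"])[0])    # pages are counted from zero
    ordered = list(rows)
    # Python's sort is stable, so sorting by the least significant key first
    # and the most significant key last gives a correct multi-key ordering.
    for spec in reversed(query.get("sort", [])):
        field, _, direction = spec.partition(",")
        ordered.sort(
            key=lambda row, f=field: row.get(f, ""),
            reverse=(direction.lower() == "desc"),
        )
    start = page * size
    return ordered[start:start + size]

rows = [
    {"fruit": "Apple", "color": "Red"},
    {"fruit": "Cherry", "color": "Red"},
    {"fruit": "Banana", "color": "Yellow"},
    {"fruit": "Lime", "color": "Green"},
]
url = "http://localhost/api/my-filter-method?sort=color,desc&sort=fruit,asc&size=2&page=0"
first_page = page_of(rows, url)
print([row["fruit"] for row in first_page])
```

Note that the sketch assumes every sorted field holds comparable values (all strings here). Unstructured JSON with mixed types in the same field would need a more defensive sort key.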

You can also add filtering by following other solutions. I recommend these:

https://www.baeldung.com/rest-api-search-language-spring-data-specifications

https://leaks.wanari.com/2018/11/20/solutions-for-a-filterable-sortable-pageable-list-in-spring

(Feel free to look for other solutions, as there are many.)

keywords: filterable sortable pageable rest api

PS: I'm not saying that Spring Boot is the most powerful solution/framework, but it's really good and handy when working with Java 9 language features. :)

Enjoy! ;)

Thanks. Useful to know that you think Mongo is the right way, and that you don't think anything else exists that already does what I want. Spring Boot looks pretty alien though, to be honest... I think the learning curve would be too big for me.
BSUK 21 days ago
Yes, I agree. For a PHP developer, learning Spring Boot is a life changer....
Chlegou 21 days ago