sails 5 years ago

This is great.

Things to keep on your radar:

Meltano, spun out of GitLab, is working in this domain, and I think they are making progress. https://meltano.com/

dbt are building the transformation engine that Meltano is using (I think), and are worth keeping an eye on https://www.getdbt.com/

In my experience the issue with this domain isn't the "one-off" analysis, but rather orchestrating the BI function across the business, maintaining the single source of truth, testing and deploying across the ETL/ELT layers. I can't speak to how well Ananas is managing these but Meltano and dbt are giving this area a lot of thought.

kfk 5 years ago

This looks good. One of your competitors charges about $4,000 per seat per year, so this seems to be a good space. If you add the possibility of building user-defined nodes with Python, you’d have a solid product.

  • bhou 5 years ago

    We will release user-defined functions in JavaScript in the next release. Python is on the roadmap too.

    • kfk 5 years ago

      Will you have a server too? Another issue is scheduling, triggers, etc. You are in this strange space between Excel and “enterprise” ETL like Talend. We were so frustrated with the current poor state of data prep that we moved to Python-based workflows, with the idea of scheduling and triggering them via Airflow. Seems to work well so far. But your IDE for data prep is a bit better positioned than what we use now (Jupyter).

      • millboh 5 years ago

        Actually, we provide a command-line interface; you can do pretty much everything the UI does from the command line, and you can also run it in server mode from the command line. The current version doesn't cover scheduling yet. A possible solution would be scheduling through Airflow's BashOperator, or we could implement scheduling in the server itself.

  • justaguyhere 5 years ago

    woah, that is steep. who is this competitor?

    • alienreborn 5 years ago

      Probably Informatica or Talend, which are two of the most popular enterprise ETL tools.

    • ZeroCool2u 5 years ago

      Alteryx would be my guess.

      • kfk 5 years ago

        Interestingly enough, both Ananas and Alteryx use a declarative approach: Alteryx uses XML and Ananas uses YAML.

        • ZeroCool2u 5 years ago

          The YAML approach seems far more appealing to me honestly, though I'm sure XML was a sane choice at the time.

dewey 5 years ago

Pretty similar to https://github.com/getredash/redash from a first look. What would you say are the main differences?

  • kfk 5 years ago

    Redash is comparable to Superset, Tableau, etc. Ananas Analytics is comparable to Talend, Airflow, etc.

  • millboh 5 years ago

    It seems that Redash is a BI tool and closed source. Ananas is open source, and can be used not only as a BI tool but also as an ETL tool. Moreover, you can run your pipeline on your own infrastructure, as Ananas can run on multiple execution engines.

    • codetrotter 5 years ago

      Parent commenter linked to their GitHub repo. Redash is open source under the terms of the two-clause BSD license. (Maybe they edited to add link, maybe you overlooked it?)

      • millboh 5 years ago

        Oh, sorry about that, my bad. I just looked through the Redash home page, but didn't find the open source link.

        • dewey 5 years ago

          The GitHub link is in the footer of the site, but the icon is very small.

        • djmips 5 years ago

          redash source <- search

mtarnovan 5 years ago

This looks a bit like https://metabase.com -- anyone used both and can make a comparison?

  • wiremine 5 years ago

    Curious if anyone has used Metabase for serious work and can comment on it. I tried setting it up and got frustrated pretty quickly with the UX. It looks slick, but the mental model was confusing...

    • btown 5 years ago

      My startup seriously evaluated Metabase, but inevitably you'll want joins, and Metabase is fundamentally opposed to the idea: https://metabase.com/blog/Joins/ ... and if you're writing SQL for your views, you might as well be writing SQL.

      We ended up shelling out for Tableau - it's pricey at $840/yr, but it supports joins out of the box (it even has a drag-and-drop interface to set up joins!), has practically every bell and whistle you could ask for, and allows you to do "exploratory analysis over screenshare in realtime with non-technical colleagues, without context switching to a coding mindset or needing to look up field names you may never have used before." I think it's intuitive and worth every penny, but YMMV. Would recommend everyone try the public version to get a feel for it.

      EDIT: as others have said, Ananas is actually ETL + BI, whereas Metabase and Tableau are BI on top of a database. Tableau can stand in for good ETL due to its join support in certain scenarios. It's better than Metabase, but not necessarily comparable to Ananas.

      • tlrobinson 5 years ago

        (I work at Metabase, and am working on the joins feature as we speak)

        As getsauce mentioned, we're adding joins to the next release, as well as essentially subqueries in what we're calling the "notebook". That should unlock a lot of power.

        AMA.

        • fegul 5 years ago

          Big fan of Metabase, thanks for all your work!

          I've always been curious if there is a feature in the works for charting or displaying comparisons across defined timespans (e.g. total page views this week vs. last week)

    • schlowmo 5 years ago

      I really wanted to like Metabase but unfortunately it's way behind its promise. I made an attempt to use it in a customer project for creating very basic customizable dashboard-like statistics, with little to no success. Just to name a few pain points:

      - using the docker image is easy; not so using the jar file especially due to very little documentation

      - confusing UI paired with lack of extensive documentation

      - this makes it far from "easy" to be used by "everyone in your company" (quotes from metabase marketing claim): you really have to know how to do things, even easy ones like changing labels

      - the UI contains many minor bugs which sometimes lead to unsavable metrics, and you just have to start from scratch

      - no built-in way to export dashboards, which makes it nearly impossible to test your new metrics on a different system before pushing them to production; if you really want to do this, you have to juggle database dumps

      There might be some valid use cases for Metabase, but I don't think it's very usable for non-technical users. I strongly suggest evaluating it thoroughly before counting on it.

      Nevertheless thanks for making it open source and free to use, so don't get me wrong.

    • sails 5 years ago

      Metabase is pretty good if you have a nicely configured data warehouse (Snowflake and BigQuery are good options). If you are connecting Metabase directly to your app database, then you will probably run into issues trying to integrate another data set (say, CRM data).

      This is where it makes sense to ELT (extract, load, transform) everything into a data warehouse, integrate the data there and transform as much as possible, and do the "last-mile" analysis in Metabase.

      This is at least the theory. I've had reasonable results with Metabase doing it this way. It's also nice in that the bulk of your logic sits in your data warehouse, so a BI tool migration is less painful, and it's also possible to run dual analytics tools.

      Check out https://www.getdbt.com/ for more on the process.
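
      A minimal sketch of the ELT flow described above, using Python's built-in sqlite3 as a stand-in for a real warehouse such as Snowflake or BigQuery. The table names, columns, and sample rows are invented for illustration; in practice the transform step would more likely live in a tool like dbt.

```python
import sqlite3


def run_elt(app_orders, crm_accounts):
    """Load two raw sources as-is, then integrate them with SQL in the warehouse."""
    conn = sqlite3.connect(":memory:")  # stand-in for the data warehouse
    # Extract + Load: copy the raw rows in without reshaping them first.
    conn.execute("CREATE TABLE raw_orders (customer_id INTEGER, amount REAL)")
    conn.execute("CREATE TABLE raw_crm (customer_id INTEGER, segment TEXT)")
    conn.executemany("INSERT INTO raw_orders VALUES (?, ?)", app_orders)
    conn.executemany("INSERT INTO raw_crm VALUES (?, ?)", crm_accounts)
    # Transform: join and aggregate inside the warehouse, so the BI tool
    # only has to read a ready-made table for the "last-mile" analysis.
    conn.execute(
        """
        CREATE TABLE revenue_by_segment AS
        SELECT c.segment AS segment, SUM(o.amount) AS revenue
        FROM raw_orders AS o
        JOIN raw_crm AS c ON c.customer_id = o.customer_id
        GROUP BY c.segment
        """
    )
    return conn


# Hypothetical sample data: (customer_id, amount) and (customer_id, segment).
conn = run_elt(
    app_orders=[(1, 20.0), (2, 5.0), (1, 10.0)],
    crm_accounts=[(1, "enterprise"), (2, "self-serve")],
)
rows = conn.execute(
    "SELECT segment, revenue FROM revenue_by_segment ORDER BY segment"
).fetchall()
print(rows)
```

      Because the join lives in the warehouse table rather than in the BI tool, swapping Metabase for another front end would only mean pointing the new tool at `revenue_by_segment`.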

    • jesterson 5 years ago

      We use it in production and love it. I also know a few companies that use it on fairly big data flows.

      I am not sure why you are frustrated with the UI; my colleagues and I find it quite good.

    • mazameli 5 years ago

      Sorry to hear that (I'm the UX guy at MB). Would appreciate your thoughts/feedback on any specifics you want to share. Thanks!

      • fegul 5 years ago

        I'm a fan of your work! We use Metabase pretty frequently.

        The only nitpicks I have are around the concept of Metrics (still not quite sure what those are or how they're useful for me) and the initial download of the libraries, which takes quite a while (especially over unstable VPN links).

        I'm wondering if there's a way to have an option where it tries an external CDN first and then falls back to loading from the hosting server.

      • wiremine 5 years ago

        Thanks for following up! Here are some of the things that tripped me up:

        1. The lack of in depth docs.

        2. The setup and usage of metrics was confusing. This was the main use case I was hoping Metabase could help me with, and it felt like an add-on feature.

        3. For whatever reason, managing dashboards was really confusing, and the UI [1] didn't seem to match the docs.

        [1] I was using the mac version.

        • mazameli 5 years ago

          Thanks, I appreciate your feedback.

      • llampx 5 years ago

        Why do charts with dates on the x-axis not show up correctly? For example the chart will show Jan 2019 on the label underneath a column that is actually Feb 2019. This is confusing to new users and drives experienced users up the wall. Currently the only thing that fixes it is to convert the axis to categorical.

  • mjirv 5 years ago

    Metabase is much more for data visualization. As far as I know it doesn't really have any ETL features.

    I could actually see this being pretty useful as an ETL layer to go along with Metabase if someone were trying to build a free/open source BI stack.

programbreeding 5 years ago

Just FYI, about 1/3 of the way down your Getting Started page[0] it has a broken link[1] to the fifa2019.csv file. The first link on the page is valid[2], but the second one leads to a 404 due to pointing to .../raw/... rather than .../blob/...

[0] https://ananasanalytics.com/docs/user-guide/getting-started

[1] https://github.com/ananas-analytics/ananas-examples/raw/mast...

[2] https://github.com/ananas-analytics/ananas-examples/blob/mas...

  • bhou 5 years ago

    Thanks for the reminder, the links are fixed now.

nishkalkashyap 5 years ago

I would recommend code-signing the build before distributing.

  • yazan94 5 years ago

    I'm not super familiar with code signing, but if alternatives are expensive, could OP maintain a checksum value on their download page rather than go with DigiCert or alternative services? Or does code-signing solve a different problem?

    • nishkalkashyap 5 years ago

      No. Code signing is very different. A checksum would only work for developers on Linux. Without a code signing certificate, macOS would refuse outright to run the app and Windows would show an 'Unverified publisher' warning. Also, things like auto-updates do not work on either platform unless you code sign your binaries.
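
      For reference, the checksum idea from the parent comment could look roughly like the sketch below. It only confirms that a download matches a published hash; it does nothing about the OS-level publisher warnings described above. The function names are illustrative and not part of any Ananas tooling.

```python
import hashlib
import hmac


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, read in chunks to bound memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published_checksum(path, published_hex):
    """Compare a file's digest with the value listed on the download page."""
    # Normalise whitespace and case, since published hashes are often pasted by hand.
    return hmac.compare_digest(sha256_of_file(path), published_hex.strip().lower())
```

      The published value has to come from a channel the user already trusts (for example the project's HTTPS download page), otherwise the check proves little.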

  • millboh 5 years ago

    Thanks for your feedback, we will look for some affordable code-signing certificates. Any suggestions? By the way, here is the issue link: https://github.com/ananas-analytics/ananas-desktop/issues/61

    • pimterry 5 years ago

      I set up code signing for an Electron app relatively recently. The best option I could find was DigiCert. It really sucks that this stuff is necessary nowadays and not free, but it's not so bad.

      That's for Windows - for Mac you'll also need an Apple developer account, afaik they're the only people who can issue certs.

      EDIT: Woah, I take that back. DigiCert has now gone up from $74/year to $474/year, which is crazy. I now also need a new certificate provider...

      • NewsAware 5 years ago

        For Electron signing we use Tucows code signing certs (you need to register as a Tucows author, which is free), which are provided by Comodo for $140 for 2 years. We didn't have any issues besides getting a proper CI/CD process running.

      • justinclift 5 years ago

        There aren't any great options, but if it helps we (sqlitebrowser.org) went with Certum:

        https://en.sklep.certum.pl/data-safety/code-signing-certific...

        We chose the "Open Source Code Signing" option, with the certificate stored on a physical key fob (i.e. not "in the cloud"). Total cost, including the new key fob and the super expensive, week+ delay, mandatory postage (!), was around 135 Euro.

      • nishkalkashyap 5 years ago

        For my project (quarkjs.io), I went with https://comodosslstore.com . They have the cheapest certificates I could find (at ~75 USD), and they are also the only ones issuing certificates for individual developers.

najarvg 5 years ago

This looks very good and is a fit for my end users, who deal with Excel files all the time. Are there any plans to add Excel as a data source? We cannot convert to CSV without major pain, since the Excel files are exports from mainframe apps which are out of my control. Thanks

jbverschoor 5 years ago

The app icon is transparent on Mac, and therefore only clickable on the border

nishkalkashyap 5 years ago

I would recommend distributing binaries from a dedicated release server combined with a CDN, possibly DigitalOcean Spaces. It really increases download speeds for end users compared to GitHub releases.

jbverschoor 5 years ago

Unfortunately it's created by an unidentified developer

  • jessaustin 5 years ago

    Be sure and write some special firewall rules before running this...

chrsstrm 5 years ago

At what scale has this been tested? As in, are you aware of any data file size limits? I have a CSV with ~6M rows, and when paging through the docs the "Exploring your data source" section gave me pause, thinking this app might try to open all 6M rows at once. Will I be OK importing such a large source, or will my computer turn into a space heater before refusing to respond?

  • bhou 5 years ago

    Ananas has been tested in production, processing terabytes of data on a daily basis (with Google Dataflow, but you can achieve the same thing with your own Spark cluster too).

    In terms of exploring large source files, the design principle is to paginate any kind of data that supports random access to records (for example CSV, logs, etc.). So when "exploring the data" of a CSV with 6M rows, Ananas will not load 6M rows at once, but will read a few rows at a time for each page. For example, this early demo video shows a 755M CSV file being explored in seconds: https://www.youtube.com/watch?v=GwqZlhmei78&t=01m00s
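
    To illustrate the pagination idea (this is not Ananas's actual implementation), one page of a large CSV can be read without loading the whole file. The sketch below streams past earlier rows rather than seeking by byte offset, and it assumes the first line is a header; `read_csv_page` is a made-up name.

```python
import csv
from itertools import islice


def read_csv_page(path, page, page_size=50):
    """Return (header, rows) for one zero-based page of a CSV file."""
    with open(path, newline="") as f:
        reader = csv.reader(f)
        header = next(reader, [])
        start = page * page_size
        # islice skips earlier rows lazily, so only one page is held in memory.
        rows = list(islice(reader, start, start + page_size))
    return header, rows
```

    Memory stays bounded by the page size, but reaching a late page still costs a scan from the top; true random access would need a precomputed index of row offsets.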

eli_gottlieb 5 years ago

Ok, but why did you name it after pineapples?

  • millboh 5 years ago

    Ananas, Analytics made easy :) Pineapple was cool too. Will probably change it if we see more comments ;)

    • hondadriver 5 years ago

      No, just keep it. It's fun, and why does the name have to be an existing English word?

      Fun fact: if everybody starts using it, it will eventually become proper English.

    • SysINT 5 years ago

      Pomme-stylo-anana or name it after Pikotaro!

mingabunga 5 years ago

Have you thought about adding some words to the data output using natural language generation? E.g. arria.com or another NLG vendor?

  • millboh 5 years ago

    Excellent idea. We've thought about machine learning transformers, including NLP. NLG is something that would definitely be nice to have. Please create an issue and we will prioritize it.

jugg1es 5 years ago

Does this have any sort of hinting for indexed queries at all? I would worry that a beginner would create a horrid mess of queries that could consume all available resources.

  • millboh 5 years ago

    That's a good point. Actually, we think of this tool as a collaboration tool that enables non-technical users and data engineers to share the visual DAG and work together. The Apache Beam runners we use behind the scenes have a query planner to optimize chained queries. However, you're totally right: this can't stop non-technical users from writing messy queries. The visual DAG should, however, help them split a complex query into simpler ones.

VvR-Ox 5 years ago

Oh nice - thank you very much! :-D

I thought about writing my own app for exactly that task but when I see yours I think I don't need to do that anymore. Awesome! :-)

richk449 5 years ago

Can I use this if all I have is an odbc connection?

  • millboh 5 years ago

    What kind of data source exactly do you need? We should be able to add a Microsoft data source such as MSSQL if you request it.

  • rmbeard 5 years ago

    I have the opposite issue: is there a way to connect to a cloud database without an ODBC connection?

mtw 5 years ago

This looks great as a promise - I looked into visualizations provided, and we need much more than what's provided though

lucasverra 5 years ago

Can we hook this up to an API GET request? I guess I could do API -> download JSON -> Ananas, but you know..:)
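
The API -> JSON -> CSV detour could be scripted roughly as follows. This is a sketch that assumes the endpoint returns a JSON array of flat objects; the function names are hypothetical, and the resulting CSV would then be loaded as an ordinary CSV source.

```python
import csv
import io
import json
from urllib.request import urlopen


def fetch_json(url, timeout=30):
    """GET a URL and decode the response body as JSON."""
    with urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


def records_to_csv(records):
    """Convert a list of flat dicts to CSV text, keeping first-seen column order."""
    columns = []
    for record in records:
        for key in record:
            if key not in columns:
                columns.append(key)
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=columns, lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)  # keys missing from a record become empty cells
    return out.getvalue()
```

Writing `records_to_csv(fetch_json(url))` to a file gives something a CSV source can read; nested JSON would need flattening first, which this sketch does not attempt.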

  • cr0sh 5 years ago

    I was thinking the same thing, for like IoT monitoring sensors, etc - but it is open source; grab it (once GH is back online), add the new "source", and issue a PR - that'd be the way to do it I think...

  • millboh 5 years ago

    The API data source is a great idea! You can create a feature request on our GitHub. (Otherwise we will create it.) We will try to put that into upcoming releases.

yazan94 5 years ago

This looks really cool! Thanks for sharing, I can't wait to test this

pplonski86 5 years ago

What is your business model?

overcast 5 years ago

My unprofessional professional opinion. The product looks great, but the name has to go. I can't imagine pronouncing that, let alone communicating it over a phone. Any simple word before analytics would be better.

Edit: pineapplytics is the obvious cute and available one, however may still be difficult to communicate.

  • henrikschroder 5 years ago

    Fun fact: Only the English language calls the fruit "pineapple", almost every other language calls it "ananas" or similar.

    • hondadriver 5 years ago

      We call pineapple juice 'ananassap' in Dutch and sometimes use it as if it would be an English word as a joke.

    • overcast 5 years ago

      I looked it up, I get it, but this post and the site are targeting English speakers.

      • _frkl 5 years ago

        Well, or they just want everyone to be able to access it? There is really no choice other than to publish something like this in English. Just a guess, but I'd guess the number of people accessing it who are not native English speakers is larger than the number who are.

      • ygjb 5 years ago

        Now I really want the author to rename it pineapple in the English localization, and leave it Ananas everywhere else...

    • mikorym 5 years ago

      That's not true. In my native language it is called "pynappel".

      EDIT: I guess that may be why you said "almost" but that, in turn, is almost impossible.

    • pmelendez 5 years ago

      That's a bit of a bold statement... In Spanish is Piña and in Portuguese is Abacaxi, but a good number of languages do call it ananas

      • caiocaiocaio 5 years ago

        In Portugal Portuguese it's Ananás.

        • michaelmior 5 years ago

          I think this is also true in the Spanish dialects spoken in many regions.

      • maximente 5 years ago

        > In Spanish is Piña

        equally as bold, and also incorrect: ananá is what you'll hear in Argentina, possibly elsewhere.

    • techie128 5 years ago

      It's called "ananas" in Marathi as well. Marathi is a regional language in India.

  • millboh 5 years ago

    Delighted that this is one of the first comments! :) The product was designed to make analytics easy. We found that the word Analytics is not easy to pronounce either. So we decided to make the word analytics easy too! But thanks for your comments, we will consider it.

    • koolba 5 years ago

      I think the name and logo are nice (pineapple database right?) but agree that it's both difficult to spell and pronounce (particularly for people who refer to them as "pineapples").

      If you want to be cheeky, CONCAT(SUBSTR('analyst', 1, 4), 'desktop') is available for a .com domain.

      • bhou 5 years ago

        Love this CONCAT(SUBSTR('analyst', 1, 4), 'desktop') idea! ;)

  • dugluak 5 years ago

    People have the same opinion about Azure. Microsoft didn't change it. Ananas is still not that bad.

    • tlrobinson 5 years ago

      "Ananas" alone might not be that bad, but "ananasanalytics.com" certainly is.

      Bob Loblaw's Law Blog, anyone?

  • caiocaiocaio 5 years ago

    I really loved the name, but there was a bit of disconnect because I thought something named pineapple would lead me to a more exciting, artsy page.

  • djmips 5 years ago

    your opinion is bananas

  • aiisjustanif 5 years ago

    Who said it was made by an English speaker...?

    Typical. Smh.

dlphn___xyz 5 years ago

what advantages does this have over the ELK stack?

tracer4201 5 years ago

No Redshift support? Hmmm.

  • bhou 5 years ago

    It is on our roadmap! We will continue adding more data sources in upcoming releases.