As a SQLdep founder, I am happy to announce our partnership with Collibra. Collibra is a leading Data Governance platform for Master Data Management. With total funding of $134M and revenue doubling every year, Collibra is on its way to becoming the #1 solution for metadata management.
Our dev team has implemented yet another database dialect: SQLdep now supports IBM Netezza, which is used in data warehouse appliances and for advanced analytics.
For early birds, we offer a free unlimited tier until the end of June 2018. You are welcome to upload your Netezza SQL scripts and we will take care of the rest. All metadata can also be downloaded in CSV format. Get in touch and enjoy the free tier!
We are happy to announce support for the Snowflake SQL dialect. All SQLdep features, such as automated documentation of ETL processes and capturing end-to-end data lineage, are now available for Snowflake.
Snowflake itself is a highly optimised cloud database for data warehousing. The company was founded in 2012, and by early 2017 it had completed a Series D round, raising $100M.
On Thursday, 27 April 2017, we have scheduled downtime. The migration to a new server will start at 10:00 GMT and will be finished within 60 minutes. During the migration we will also update DNS records (the domain name remains the same).
We are switching to a more powerful server and also changing our internal infrastructure. Hypervisors will be installed to enable clustering and to obtain a fully redundant server platform. We do apologize for the inconvenience.
Our business glossary add-on helps stakeholders across the company actively manage and describe their data. The key feature is its semi-automated workflow: most of the glossary is pre-populated from existing sources, so the manual input needed from data stewards is greatly reduced.
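To make the idea concrete, here is a minimal sketch of what such pre-population could look like, assuming a hypothetical glossary_candidates table and an ANSI-style information_schema; the names are illustrative, not SQLdep's actual implementation:

```sql
-- Hypothetical sketch: seed glossary candidates from existing metadata.
-- glossary_candidates and the 'reporting' schema are made-up names.
INSERT INTO glossary_candidates (term, source_object)
SELECT DISTINCT column_name, table_name
FROM information_schema.columns
WHERE table_schema = 'reporting';
```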
Whenever you use SQL on a daily basis, you should make sure that everybody on the team follows the best SQL coding practices. A couple of well-placed rules go a long way when you need to analyse SQL code written by somebody else.
I have never understood how somebody can omit using fully qualified column names. It takes just two seconds to place a table alias in front of the column name.
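A minimal illustration with made-up tables shows the difference:

```sql
-- Unqualified: which table does "name" come from? You can't tell
-- without checking the schema.
SELECT name, amount
FROM customers c
JOIN orders o ON o.customer_id = c.id;

-- Fully qualified with aliases: the origin of every column is obvious.
SELECT c.name, o.amount
FROM customers c
JOIN orders o ON o.customer_id = c.id;
```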
The BI team at Makro faced a challenge keeping track of data lineage on their Teradata DWH. On the Czech market alone, the company's data grew to 1.5TB in size, spread over 10,000 tables and views. It proved more and more difficult to perform impact analysis over multiple data layers, i.e. from the staging layer up to the reporting layer. A senior developer had to manually trace the lineage across the ETL jobs.
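A toy example (not Makro's actual schema, and PostgreSQL-flavoured rather than Teradata SQL) shows why this gets painful: a change to one staging column ripples silently through every downstream layer.

```sql
-- Three made-up layers: staging -> integration -> reporting.
-- Renaming stg_sales.amount breaks int_sales, and through it
-- rpt_monthly_revenue.
CREATE VIEW int_sales AS
SELECT s.sale_id, s.amount, s.sold_at
FROM stg_sales s;

CREATE VIEW rpt_monthly_revenue AS
SELECT DATE_TRUNC('month', i.sold_at) AS month,
       SUM(i.amount)                  AS revenue
FROM int_sales i
GROUP BY DATE_TRUNC('month', i.sold_at);
```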
We are moving forward to the launch of SQLdep QueryScope: in one week we are launching the private beta! Here is the first design of the user interface. How do you like it? We look forward to reading your comments or seeing you among the beta testers (sign up here).
Data lineage is essential for understanding the data flows in your BI ecosystem (ETL jobs, SQL-based reporting, …). Proper data lineage helps your developers quickly identify the impact of a change in SQL while keeping an audit trail behind it. Finally, BI end users should have access to data lineage as well, so they can successfully leverage the value of the data itself.
There is a big productivity shift about to happen in the SQL world. Until now, data analysts and developers have performed SQL code analysis manually, and this approach was the widely accepted standard. This performance killer ranges from 10 minutes for a moderately complex query to 2+ hours for a couple of complex queries.
Ten years ago, business intelligence became an inseparable part of every company. Every company heard that to achieve great success in the coming years, they had to start using their data. So they created new BI teams, built new DWH architectures, tried to find the balance between on-premise servers and the cloud, migrated to SaaS, and so on.
It’s early in March 2014, and I’m in Prague, Czech Republic, one of the global capitals for delicious beer, beautiful architecture, and a growing technology community. I’m speaking with Martin “Masi” Masarik and Miroslav “Madhouse” Semora about SQL development, data analysis, and their views on the SQL space, including their reasoning for devoting the last year of their lives to building SQLdep, a visualization and analysis tool for SQL developers and business intelligence analysts.
Plenty of business intelligence teams maintain text-based metadata. Writing things down is easy to start with, and you get results quickly. The speed bumps come as metadata complexity grows and the lineage becomes really intertwined. It is as if someone gave you a textual description of how to drive from New York to Philly. Wouldn’t you be like, “Hey, buddy, gimme a map instead!”?
Developer teams often miss deadlines, regardless of what they are working on. Software development always takes longer than expected, mainly for three reasons. First, scheduling is insufficient because it is done by managers who often have little or no development experience. Second, the team does not always have all the tools it needs. Third, development always runs into problems that were not accounted for. And when management scopes out the timeline for a project, they often forget to consider the time spent around development where actual coding does not take place.
Who still has a blank stare even after two months in their new position? The answer: the junior developer you hired and showed your ETL to. I am talking about an ETL with hundreds of intertwined procedures. To be fair, a senior developer would have to chew on it too, but they’re less likely to choke.
The core of good business intelligence is a data warehouse, where information from various sources can be collected and organised effectively. Using this wealth of data enables analytical BI, which not only allows you to react to what has happened, but to predict what’s going to happen.
Do you know what’s difficult and prone to errors? Interpreting a complex SQL query that someone else wrote a year ago. Do you know what makes this scenario even worse? Having two different versions of that SQL query, and it being up to you to figure out the difference.
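As a made-up example of how subtle such a difference can be, compare these two versions of the same hypothetical report query:

```sql
-- Version 1, written a year ago:
SELECT region, SUM(amount) AS revenue
FROM orders
WHERE status = 'shipped'
GROUP BY region;

-- Version 2: one extra predicate, easy to miss in a long script,
-- yet it changes every number in the report.
SELECT region, SUM(amount) AS revenue
FROM orders
WHERE status = 'shipped'
  AND order_date >= DATE '2014-01-01'
GROUP BY region;
```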
I love SaaS services where you can immediately try the tool on a real-life use case. Since SQLdep is focused on SQL code parsing, a challenge was finding a really simple way to get SQL code to our REST API. The goal was that nobody should spend more than 5 minutes submitting their complete SQL codebase to us.
Alright, the buzzword again: big data. This time, however, from the perspective of a relational database. In particular, Teradata had a sweet idea to bring big data to relational databases: in 2011 they acquired Aster Data Systems, enhancing their product portfolio with the Teradata Aster Database.
Keeping your metadata under control, and thus providing data lineage, is a toughie. Not to mention the extra level of complexity when data travels across multiple databases, especially across separate instances, where queries reach the data through database links.
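For instance, with an Oracle-style database link (the names here are illustrative), the lineage of a single query already spans two instances:

```sql
-- "orders" lives in a remote instance reached via the dwh_prod link;
-- "customers" is local. The lineage now crosses a database boundary.
SELECT o.order_id, c.customer_name
FROM orders@dwh_prod o
JOIN customers c ON c.customer_id = o.customer_id;
```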
Today at SQLdep we proudly release a major feature that’s a must-have for any Business Intelligence ecosystem. To explain this feature in all its glory, let’s talk about the Wayback Machine. The Wayback Machine is a service that keeps track of any page on the Internet over time. About a year ago, the size of this archive was 15 petabytes.
Stopping by the booths at Strata-Hadoop, I was struck by how many tools out there generate code. The idea is that non-technical users can create data flows with visual tools, and the nasty complexity of code is hidden from them. In theory, this sounds great: you save time and, in the short term at least, development resources.
I was one of those developers in a large corporation who run around undocumented code like headless chickens. I routinely analyzed thousands of lines of SQL code, and a month later I would have to do the *exact* same part from scratch. We were a team of 30 developers, and every single day each of us spent 2 hours on SQL code analysis. A hideous routine for us and a really expensive task for the company.
Gartner recently released its 2014 Magic Quadrant for Data Warehouse Database Management Systems, and perhaps not surprisingly, it describes the rising influence of big data and the cloud, as well as the investment by traditional vendors in their data warehouse offerings.
So, here’s the issue: you’re a data analyst in a large company, working with a data warehouse as part of the team responsible for reporting. You have a fairly complex set of SQL queries that set up regular reporting, compute various metrics like profitability, prepare numbers according to accounting regulations like US GAAP, and much, much more.