From Attic to Oracle: The Ascension of a Data Warehouse Maverick

Manna Mahmud
9 min readApr 28, 2024

--

The Reason

During my last year at university, I stood before a significant challenge: the final project or thesis that all students must submit. Amidst this pivotal moment, my personal life was in disarray. My relationship with my father had hit rock bottom, as every day brought a new argument. I recognize now that much of this was because of me — my passion for music, my long hair and cowboy look, and the countless hours I spent with friends likely painted me as a lost cause in his eyes. Despite this, I was actually doing quite well academically, preferred by my teachers for my quick understanding of system design and complex database queries.

When it came to choosing my final year proposal, I aimed for something that would ensure an easy pass. I scoured the library and stumbled upon the work of a senior student from the previous year: ‘Optimizing Data Warehouse Design.’ Seizing this opportunity, I tweaked the title to ‘Fine-tuning Data Warehouse Design and Development,’ and confidently submitted it to my supervisor, hoping it would seem like a fresh idea. This supervisor was not only an expert in advanced databases but also a personal favorite and future reference for my academic career.

Upon reviewing my proposal, however, my supervisor saw right through my strategy. With a knowing smile and a swift stroke of his pen, he crossed out my title. He then wrote something new, handed it back to me, and said, I must revise it accordingly and not a thesis, it must be a project. The latest title, ‘A Tool to Design and Develop Data Warehouses,’ was far from what I expected. It suggested I should create an actual software tool for designing and developing data warehouses, complete with features like schema designing and ETL pipeline management — a task that seemed nearly impossible at the time.

Bewildered, I tried to explain to him the enormity of what he was asking. In response, he simply chuckled with confidence and turned away, making it clear there would be no further discussion. I stood there, partly overwhelmed by his belief in my abilities but mostly panicked at the scale of the project.

And yet, despite my initial fear, a small, excited voice inside me dared to say, “Hey, let’s do it. It’s going to be a fun ride.” With that thought, I decided to accept the challenge, not knowing then that it would lead me to leave my house, rent an attic, and spend the next year and a half creating the tool.

The Formation

With a foundation in Java Swing and the Eclipse IDE, I had my toolkit ready, and my purpose became clear: I would construct a sanctuary for data warehouse specialists — a single environment to accommodate their every need. My ambition demanded more than just academic success; it required isolation, a space where I could concentrate free from the familial chaos.
The call for independence was not just a matter of pride; it was a crucial step in my journey. Whether to shock or wound my father — or perhaps both — I made the decision to leave home. My search for a new abode led me to various corners, seeking a place that would not burden my modest student budget.
Then, as if by fate, I found it — an attic room perched atop a humble building, perfectly aligned with my financial constraints. It was an unusual choice but one that resonated with the sense of retreat I sought.
Luck stayed by my side, as two confidants, fellow scholars driven to master the intricacies of Oracle DBA, chose to join my newfound abode. They saw in me a guide in the world of databases, as I was known to navigate Oracle 9i with ease.
We three migrants of academia assembled our humble rigs in the attic, crafting it into both a study and a haven. While my friends maintained the tether to their homes, coming and going, I was steadfast in my new lair. I had set upon a path from which there was no easy return — not until I could face my father as someone who had not just left but had triumphed.

The Study

My quest for knowledge led me to devour any literature on data warehousing, metadata modeling, and management that I could extract from the library or the vast expanses of the internet. In the early 2000s, Oracle 9i was like a beacon of enlightenment; it offered a wealth of white papers on data warehousing techniques, data mining, the intricacies of materialized views, advanced streaming, and the workings of SQL*Loader. These resources became the cornerstone of my study.
Despite this arsenal of information, I found myself at an impasse: the daunting question of where to begin. Oracle 9i, fortunately, was written in Java — a language I was intimately familiar with. I delved into its code, deconstructing and reconstructing it like a child who learns through the relentless dismantlement of toys. This vigorous cycle of experimentation was my harsh yet effective teacher. It schooled me in the arts of multithreading, the elegance of observer patterns, the significance of contexts, the subtleties of different logging mechanisms, and the classifications of data.
Through Oracle 9i’s lens, the server world unfolded before me, revealing secrets of server-client architecture, remote method invocation, packaging, and service orchestration.
Even with this newfound knowledge, a starting point for my project eluded me — until serendipity struck. One of my friends, rightly dubbed an angel, presented me with a CD brimming with PDFs on data warehousing. My hopes were not high, as my own collection was extensive. To my surprise, within this digital trove lay the key to commencing my journey — a white paper detailing Oracle’s Sales History Schema, complete with DML scripts for a star schema and a wealth of example data.
With this document in hand, I leaped towards my computer, eager to unlock the Oracle 9i schema. It was as if the fog had lifted, and at that pivotal moment, I knew exactly where to begin my odyssey into the world of data warehousing.

The POC Design

With the blueprint of Oracle’s Sales History (SH) schema firmly etched into my strategic plan, I settled on it as the archetype for my data warehouse (DW) schema. The groundwork for my project began to solidify as I designed a set of heterogeneous transactional demo sources — initially focusing on relational database systems (RDBMS) and Comma Separated Value (CSV) files. I enlisted the capabilities of Oracle tools such as SQLPlus (Sqlq) and SQLLoader, leveraging them as the vehicles to transport and transform data from the source to the target data warehouse, effectively mapping my Extract, Transform, Load (ETL) journey.
The foundational elements of the Source, Mapper, and Target were locked into place, creating a clear path forward. The next critical component required was the Metadata layer — a Semantic Layer that would serve as the bedrock for mapping complex data structures to a more readable format, thus facilitating user interaction and understanding. Alongside this, an Analytical Report Designer was essential, one that could harness the Metadata to construct and render insightful reports.
However, recognizing the magnitude of developing such a tool from scratch, I opted for a more strategic and time-efficient choice. I integrated Oracle Discoverer into my project design, a decision that served dual purposes. It not only demonstrated the potent capabilities of connecting with the target data warehouse and performing data analysis but also provided a suite of templates for reports that could be further customized using Oracle Report Designer.
With these decisions, my project was no longer an abstract vision. The basic logical skeleton — the inputs, the processing framework, and the output mechanism — had taken shape, outlining a clear structure for the proof of concept (POC) I sought to create.

The Coding

The architecture of the user interface crystallized in my mind: a straightforward three-column layout to shepherd the user through the process of data warehousing. The first column would stand as a navigational tree, rooting each data warehousing project as a parent node from which sprang children nodes representing sources and targets. Intuitive interactions were key; a right-click on a source or target would unfurl a menu of connection properties and design utilities.
The central column was resolute in purpose — a multi-tabbed editor where one could view and interact with an editable schema. Its counterpart, the right column, served as a dynamic display for the properties of the DDL (Data Definition Language) components, revealing detailed attributes and offering contextual actions upon right-clicking on schema elements, such as mapping connections or formatting options.
Leveraging my proficiency in object-oriented principles and Java, I wove a web of command and observer patterns, binding actions to listeners with meticulous care. However, the challenge presented itself when JDBC connections proved inadequate for the miscellaneous database operations I sought to perform. One of my showcase features was to spin up a target DW for retail sales using a pre-configured Oracle 9i instance in a mere 30 minutes — a stark contrast to the standard 90-minute installation.
My intimate knowledge of Oracle 9i’s architecture was the key to this feat. I had deciphered and mapped the variables of a bare Oracle 9i installation: paths, listener ports, control files, data files, log files, and environmental variables — and turned them into a templated clone. This allowed me to replicate a fresh Oracle instance rapidly, replacing default values with user-defined parameters.
Yet, the daunting array of shell scripts that empowered these actions quickly became a labyrinth of complexity. They numbered in the hundreds, necessitating an efficient management and exception-handling mechanism outside of Java’s native environment.
My break came serendipitously in a conversation with a friend — a recent victor in a UK programming competition and a Java maestro. He introduced me to Java’s Runtime API which could manipulate the input, output, and error streams of processes. This revelation was empowering. Armed with the Runtime API, every action could now be underpinned by script execution, funneling parameters from my data warehousing IDE, managing output and error streams, and alerting users to the outcomes of their operations.
The result of relentless study, design, and coding for over one and a half years began to show life. My tool worked flawlessly in ideal conditions, provided all inputs were precise. It captured and visualized Fact and Dimension tables, allowed metadata encapsulation, channeled data into predefined report templates, conjured up reports and charts, and even integrated smoothly with Oracle Discoverer.
Demonstrating this intricate web of data processes was not simple as the subtleties of data warehousing were lost on most who surrounded me. Yet, my supervisor, equipped with substantial knowledge, was thoroughly impressed — a testament to the journey I embarked upon, from a scholarly attic-dweller to the creator of a functioning data warehousing tool.

The Big Picture

In the vast panorama of my journey, the tool — my creation — remained a personal milestone rather than a commercial triumph. It was riddled with bugs, an incomplete tapestry not yet ready to be unfurled in the bustling market of polished products. The endeavor lacked the refinement of thorough market research, making it more of a proof of concept than a viable contender. This was not an end, but rather a transformative beginning; the effort imbued with care and sincerity forged the professional I became — it was never in vain.
As the year unfolded, I was introduced to the Pentaho Business Intelligence (BI) Suite and Kettle, and my previous toils ensured that mastering these tools was but a brief adaptation. In no time, I was not just conversant with Pentaho but could confidently navigate its intricacies as an expert.
An opportunity arose, whimsical in its advent, to participate in a Data Warehouse Expert exam. To my surprise and without prior strategizing, I attained a recognition that resonated across my world — the title of the 3rd Top Data Warehouse Expert on Odesk. It was a pivotal moment, a testament to the knowledge and skills I had honed.
What followed was a deluge, a surge of ETL and BI projects on freelance marketplaces. My profile was now synonymous with expertise, and the demand for my skills was relentless. The attic, once a shelter for a student’s ambition, had metamorphosed into the crucible that tempered a professional’s success, a stepping stone to a history that was mine to write. Several years later, when I returned home as a Senior Software Architect from a prestigious company in the Netherlands, for the first time and last time in my adulthood my father (He passed away a few years later) gave me a full-blown hug and in that singular and final moment of paternal pride, every piece of the journey — the attic, the coding, the accolades — snapped into place, a poignant realization that this embrace was the silent accolade I had yearned for all along.

Disclaimer: The views reflected in this article are the author’s views and do not necessarily reflect the views of any past or present employer of the author.

--

--