Introduction to MongoDB

Hi folks,

It's been really long since I blogged, and this time I am doing it from Pittsburgh! I have been working on understanding MongoDB. It is a non-relational JSON document store with no support for relational algebra (SQL). Every JSON document consists of key-value pairs, for example {'firstname': 'mg', 'lastname': 'gpl', 'hobbies': ['cycling', 'swimming', 'singing']}. A database holds collections of such documents. It is schema-less (dynamic) in the sense that documents can be created in the same collection without necessarily sharing the same schema.

Relationships that the relational world stores across tables are kept as maps in BSON format (which can be serialized to a JSON object) at the point of use. MongoDB does not support joins, transactions, or SQL; in other words, it is a document database. Looking at the history of database evolution from the perspective of scalability, performance, and functionality, stores like Memcached and other key-value stores sit high on the scalability end but low on functionality, while RDBMSs sit high on the functionality end. MongoDB lies somewhere in between, providing high performance while retaining good functionality. It avoids joins and transactions because data spread across multiple nodes cannot be joined cheaply, and atomic transactions cannot easily be supported given the concurrency issues involved.
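To make the schema-less idea concrete, here is a minimal sketch that mimics a MongoDB collection using plain Python structures; the `insert` and `find` helpers are illustrative stand-ins for what a real driver such as pymongo would provide, not MongoDB's actual API.

```python
# Sketch: a MongoDB-style "collection" modelled with plain Python dicts.
# Real code would use pymongo; the helpers here are illustrative only.
collection = []

def insert(doc):
    collection.append(doc)

def find(query):
    # match documents whose fields equal every key/value in the query
    return [d for d in collection if all(d.get(k) == v for k, v in query.items())]

insert({"firstname": "mg", "lastname": "gpl",
        "hobbies": ["cycling", "swimming", "singing"]})
insert({"firstname": "jo", "city": "Chennai"})  # different schema, same collection

results = find({"firstname": "mg"})
```

Note how the second document has entirely different fields from the first, yet lives in the same collection; that is the "dynamic schema" idea in a nutshell.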

So, this is just to give you an intro to mongoDB. Stay tuned for more about MongoDB. See ya.


Posted in Uncategorized | Leave a comment

IBM TGMC

Hello everyone. It's been some time since I last posted. This post shares my experience of the IBM TGMC contest for college students.

The contest starts by providing a set of case studies, like smart city and green planet, which reflect some of IBM's ideas for making this world a better place to live.

So, there is an initial meetup between the IBM folks and the college HODs, staff, and students, featuring an exposition of IBM technology trends and software such as IBM DB2, IBM WebSphere server, the Rational development suite for apps, etc. This serves as an initial session to set things off on a good footing.
Following this, teams register online in batches of 3 to 5 and choose their app. Mentors can be chosen both from the college faculty and from IBM. To gauge the interest, level, and skill of the students, and to reconcile the expected skill set with the existing one, intermediate performance assessments are conducted, such as submission of use-case diagrams, a synopsis, a design document, etc. Working on this project made me aware of IBM parlance such as business logic, database servers vs. databases, integrated development and deployment frameworks, etc.
This was followed by the actual coding and testing of the app, packaging the code as an EAR/WAR, and submitting it for review by IBM. This forms the basis for shortlisting the top 50 or so teams who make it to successive rounds.
So, I received a certificate of appreciation from IBM for implementing the case study on E-COPS. It was great fun and absolutely awesome exposure to the practical side of building apps at an industry level.
Like-minded geeks can find my certificate and docs attached. So, start apping…
E-COPS
TGMC-cert

Navigating the Seas of Enterprise Data Management

In the ever-evolving landscape of business, data has emerged as a valuable asset that can fuel growth, innovation, and strategic decision-making. The effective handling of vast amounts of data is paramount for organizations to stay competitive and agile. This is where Enterprise Data Management (EDM) steps in, providing a structured approach to harnessing, organizing, and leveraging data across the enterprise.

The Foundation: Data Governance

At the heart of any successful EDM strategy lies robust data governance. It sets the rules, policies, and procedures for managing data throughout its lifecycle. By establishing clear ownership, defining data quality standards, and ensuring compliance with regulations, organizations can build a solid foundation for effective data management.

Centralized Data Repositories

One key aspect of EDM is the creation of centralized data repositories. These repositories serve as secure, organized hubs for storing and accessing data. Implementing a centralized data storage solution not only enhances data accessibility but also facilitates efficient data integration across various departments, reducing silos and promoting collaboration.

Data Quality Assurance

Data quality is non-negotiable in the world of EDM. Poor-quality data can lead to misguided decisions and operational inefficiencies. A comprehensive data quality assurance process involves regular audits, cleansing, and validation to ensure that the data within the enterprise is accurate, consistent, and reliable. This commitment to data quality enhances trust in the information used for decision-making.
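As a rough illustration of what one automated audit step might look like, the sketch below flags records that fail simple validation rules; the rules and field names are invented for the example, not a prescribed standard.

```python
# Sketch of a data-quality audit: flag records that fail simple validation
# rules (bad email format, out-of-range age). Rules are illustrative only.
import re

RULES = {
    "email": lambda v: isinstance(v, str) and re.match(r"[^@]+@[^@]+\.[^@]+", v),
    "age":   lambda v: isinstance(v, int) and 0 <= v <= 120,
}

def audit(records):
    # return (record index, field name) for every failed rule
    bad = []
    for i, rec in enumerate(records):
        for field, ok in RULES.items():
            if not ok(rec.get(field)):
                bad.append((i, field))
    return bad

issues = audit([
    {"email": "a@example.com", "age": 34},
    {"email": "not-an-email", "age": 200},
])
```

In a real pipeline such checks would run on a schedule, with failed records routed to a cleansing or stewardship queue rather than just collected in a list.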

Master Data Management (MDM)

MDM is a critical component of EDM, focusing on the identification and management of master data—core business entities such as customers, products, and employees. By establishing a single, accurate, and consistent version of master data, organizations can improve operational efficiency, reduce errors, and enhance the overall reliability of their data.
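A toy sketch of the "single, consistent version" idea: merging duplicate customer rows into one golden record, using the email address as an assumed matching key (real MDM systems use far richer matching and survivorship rules).

```python
# Sketch of an MDM-style "golden record": merge duplicate customer rows into
# a single consistent version. Matching on email alone is an assumption made
# for illustration; production matching logic is much more sophisticated.
def golden_records(rows):
    merged = {}
    for row in rows:
        key = row["email"].lower()
        # survivorship rule (illustrative): keep the latest non-empty value
        merged.setdefault(key, {}).update({k: v for k, v in row.items() if v})
    return list(merged.values())

recs = golden_records([
    {"email": "A@x.com", "name": "", "phone": "555"},
    {"email": "a@x.com", "name": "Ann", "phone": ""},
])
```

The two input rows collapse into one record that carries the best available value for each field.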

Data Security and Compliance

As data breaches become more sophisticated and prevalent, ensuring the security of enterprise data is paramount. EDM includes robust security measures to protect sensitive information, incorporating encryption, access controls, and regular security audits. Moreover, compliance with data protection regulations, such as GDPR or HIPAA, is integral to safeguarding the organization against legal repercussions and reputational damage.

Scalability and Flexibility

The business landscape is dynamic, and so should be the data management strategy. Scalability and flexibility are crucial aspects of EDM, allowing organizations to adapt to changing business requirements and technological advancements. Cloud-based solutions, for example, provide the scalability needed to handle growing volumes of data, while also offering the flexibility to integrate with emerging technologies.

Continuous Improvement through Analytics

EDM is not a one-time implementation; it is an ongoing process that thrives on continuous improvement. Analytics play a pivotal role in this evolution, providing insights into data usage, performance, and trends. By leveraging analytics tools, organizations can identify opportunities for optimization, address inefficiencies, and stay ahead of the curve in the data-driven landscape.

Conclusion

Enterprise Data Management is the compass that guides organizations through the complex seas of data. By establishing strong data governance, implementing centralized repositories, ensuring data quality, and embracing scalability, organizations can unlock the true potential of their data. In an era where data is king, a well-crafted EDM strategy is not just a necessity; it is the key to staying competitive and thriving in the digital age.


Semantic Web III

Hello everyone. It is a rather long and sultry afternoon in Chennai, and here I am writing my final post on Semantic Web technologies!

So, what exactly is the Semantic Web, and what is the difference between the Semantic Web and semantic technologies? Semantic technologies include technologies such as Natural Language Processing (NLP), data mining, and AI (expert systems) that try to add conceptual meaning to real-time apps. Please note that anything other than what is specified in the technology stack in my previous post comes under the category of semantic technologies.

To make you guys aware of some of these basic technologies, here goes my explanation for the same.

Natural-language processing (NLP): NLP technologies attempt to process unstructured text content and extract the names, dates, organizations, events, etc. that are talked about within the text.
Data mining: Data mining technologies employ pattern-matching algorithms to tease out trends and correlations within large sets of data.
Artificial intelligence (expert systems): AI or expert-system technologies use elaborate reasoning models to answer complex questions automatically.
Classification: Classification technologies use heuristics and rules to tag data with categories to help with searching and with analyzing information.
Semantic search: Semantic search technologies allow people to locate information by concept instead of by keyword or key phrase.
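To illustrate the last point, here is a toy contrast between keyword search and a "semantic" search that expands the query through a hand-written synonym map. Real semantic search uses ontologies and inference rather than a lookup table like this; the map and documents are invented for the example.

```python
# Toy contrast: keyword search vs. concept ("semantic") search.
# The synonym map stands in for a real ontology; it is illustrative only.
SYNONYMS = {"car": {"car", "automobile", "vehicle"}}

docs = ["automobile repair tips", "cooking with spices"]

def keyword_search(q):
    # literal substring match only
    return [d for d in docs if q in d]

def semantic_search(q):
    # expand the query to every term the concept covers
    terms = SYNONYMS.get(q, {q})
    return [d for d in docs if any(t in d for t in terms)]
```

Searching for "car" by keyword finds nothing, while the concept-expanded search finds the automobile document; that gap is exactly what semantic search tries to close.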

In other words, the Semantic Web consists of a data model and a methodology with which to implement semantic technologies; that is the relation!

I would also like to touch upon some of the core applications of NLP for those interested. These encompass several fundamental apps such as search, auto-categorization, question answering (Siri and Watson), text summarization, and sentiment analysis to indicate positive or negative emotions in text corpora on a specific topic, and so on.

Coming to some of the tools used to implement Semantic Web technologies, one could start with Protégé, an open-source tool from Stanford for building ontologies. It also includes integrated reasoning and annotation services, and it creates Java- and XML-schema-based files containing the domain-specific vocabulary, which can be consumed by web-based services or programs.

So, start playing around with Semantic Web, do a few basic ontologies to get the feel and get started on your way to building complex services that can use these integrated schema and do intelligent factual reasoning.

Do keep me posted with comments and suggestions, and about any problems that could be solved by the Semantic Web. I have also attached a presentation on an advanced inference engine for the diabetes domain, modelled on the Semantic Web technology stack. That's all for now; catch you in my next mini-series of posts on advanced technology topics. Bonne journée!

Semantic Web


Semantic Web II

Welcome to my second post on the Semantic Web and its real-time apps. To get started, I would like to recap some of the points I enumerated in my last post.

We discussed the scope, technology landscape, and research issues and topics revolving around the Semantic Web. In this post, I would like to discuss some of the more involved and intricate terminology: RDF, RDFS, OWL, etc.

To give you a real-time example of how a concept is modeled, check out the following diagram.

Image

The ovals indicate the entities and links depict the relationships among the entities.

Where open data meets the Semantic Web, the result is Linked Open Data. While modeling such entities, it is important to keep the rules of Linked Data in mind: 1) Use URIs as names for things; any URL is an example of a URI. For example, http://www.domain.com/whitehouse could be the URI for the resource "White House" on the internet. 2) When someone looks up a URI, provide useful information, using standards such as RDF and SPARQL. These terms may sound a little new, but don't worry! I will be covering the technology hosted by the Semantic Web in more depth in my later posts.

The realization of the Semantic Web is made possible by languages that provide: a) formal syntax and formal semantics to enable automated processing of content; and b) standardized vocabularies enabling automated and human agents to share information and knowledge.

Keeping these requirements in mind, several languages serve as the basis for building a standardized domain vocabulary, the foremost being RDF, RDFS, and OWL. To depict the technology stack of the Semantic Web, the following diagram makes sense from an architectural point of view.

Image

Coming to the Resource Description Framework (RDF):

1. It is a standard for web metadata developed by the World Wide Web Consortium (W3C).
2. It is suitable for describing any web resource.
3. It provides interoperability between applications that exchange machine-understandable information on the web.

Adding a further note on RDF: it is an XML application and adds a simple data model on top of XML. This data model provides three elements: objects, properties, and values of properties.
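The object-property-value model can be sketched with plain Python tuples, each triple naming a resource, one of its properties, and that property's value. In practice one would use an RDF library, and the URIs below are invented for the example.

```python
# Sketch of RDF's data model: a set of (object, property, value) triples.
# The URI and facts below are made up for illustration.
triples = set()

def add(subject, prop, value):
    triples.add((subject, prop, value))

add("http://example.org/whitehouse", "locatedIn", "Washington, D.C.")
add("http://example.org/whitehouse", "type", "Building")

def values_of(subject, prop):
    # all values a given resource has for a given property
    return {v for s, p, v in triples if s == subject and p == prop}
```

Because every statement has the same three-part shape, any two applications that agree on the vocabulary can exchange and merge such data mechanically, which is the interoperability point made above.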

Image

In the case of RDF Schema (RDFS), it defines additional modelling primitives on top of RDF and allows one to define classes (concepts), the inheritance hierarchy for classes and properties, and domain and range restrictions for properties. In other words, RDF is the building block of RDFS, which in turn helps in the formulation of OWL (domain vocabulary specification).
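The inheritance hierarchy is what lets an RDFS-aware tool infer facts that were never stated directly. A minimal sketch of that inference, with an invented class chain:

```python
# Sketch of RDFS-style class inference: an instance of a subclass is also an
# instance of every superclass. The class names are invented for illustration.
subclass_of = {"Dog": "Mammal", "Mammal": "Animal"}

def types_of(direct_type):
    # walk up the subclass chain, collecting every inferred type
    types = [direct_type]
    while types[-1] in subclass_of:
        types.append(subclass_of[types[-1]])
    return types
```

So a resource typed only as a Dog is also inferred to be a Mammal and an Animal, without anyone having asserted those extra triples.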

An ontology is a shared and common understanding of some domain that can be communicated between people and application systems. It  is a formal, explicit specification of a shared conceptualization.

Looking at all these generic terms, we need to define a use case for the kinds of tools that mark up vocabularies for the Semantic Web and how well they satisfy it. A few key requirements are: formal languages to express and represent ontologies; editors and semi-automatic construction tools to build new ontologies; reusing and merging ontologies; reasoning services; annotation tools; information access and navigation; and translation and integration services.

I will be covering more details of the tools and open-source projects, as well as other related research topics revolving around the Semantic Web, in my next post. Till then, cya!!


Mind Mapping Tool

Hello everyone. It's a late evening in Chennai, and here I am blogging about one of my earlier projects.

I was part of an initiative to develop a mind-mapping software tool that could accurately predict a user's percepts and emotions while browsing. This started off pretty well at a conceptual level, and I came up with a high-level architecture for it.

In my view, any query posed by the user needs to be semantically analyzed by the system. Such a system should not only act as a harbinger of new information but should also keep a map of similar users' mind maps, so it can suggest content and present it to the user with higher reliability and accuracy.

So, to start, we take the user's search history and try to analyze it, coming up with a pattern to predict the user's behavior. We can go for tools like fuzzy cognitive maps, because a user may or may not accomplish a particular task with a certain probability p, and what better way to characterize this than a fuzzy cognitive map?
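For the curious, one update step of a fuzzy cognitive map can be sketched as below: each concept's new activation is a squashed, weighted sum of the activations of the concepts that influence it. The concepts and edge weights here are invented for illustration, not taken from any real user model.

```python
# Sketch of one update step of a fuzzy cognitive map (FCM).
# Concepts, weights, and the starting state are made-up assumptions.
import math

concepts = ["searches_sport", "likes_cycling", "buys_gear"]
# weights[i][j] = influence of concept i on concept j, in [-1, 1]
weights = [
    [0.0, 0.8, 0.0],
    [0.0, 0.0, 0.6],
    [0.0, 0.0, 0.0],
]
state = [1.0, 0.0, 0.0]  # user is currently searching for sport content

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def step(state):
    # new activation of concept j = squashed weighted sum of inputs to j
    n = len(state)
    return [sigmoid(sum(weights[i][j] * state[i] for i in range(n)))
            for j in range(n)]

next_state = step(state)
```

Iterating `step` until the state settles gives the map's fuzzy prediction of which behaviors the user will exhibit, which is exactly the "probability p" intuition above.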

Also, though a lot of work has been done by Google on adding semantic meaning and knowledge networks to existing indexed objects, a lot more scope exists for images. This naturally leads one to conclude that a combination of text and images, mapped to a user's mind map, would accurately present the required content and predict behavior.

There is also a need for a liaison between the two components we have identified. This calls for a decision-learning system, such as a Bayesian classifier, to map these results and to learn in such an environment. Beyond that, there is a serious question of the accuracy of a system that can provide such flexibility versus the accuracy of content as anticipated by the user.
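A minimal naive Bayes sketch of that mapping step, with invented feature and label names and simple Laplace smoothing; it only illustrates the flavor of classifier meant here, not the actual system.

```python
# Minimal naive Bayes sketch: map observed features to a predicted user
# intent. Labels, features, and training data are illustrative assumptions.
from collections import Counter, defaultdict

def train(samples):
    # samples: list of (feature list, label)
    label_counts = Counter(label for _, label in samples)
    feat_counts = defaultdict(Counter)
    for feats, label in samples:
        feat_counts[label].update(feats)
    return label_counts, feat_counts

def predict(model, feats):
    label_counts, feat_counts = model
    total = sum(label_counts.values())

    def score(label):
        s = label_counts[label] / total          # prior
        denom = sum(feat_counts[label].values()) + 1
        for f in feats:
            s *= (feat_counts[label][f] + 1) / denom  # Laplace smoothing
        return s

    return max(label_counts, key=score)

model = train([
    (["text", "news"], "reading"),
    (["image", "photo"], "browsing_images"),
])
```

Given an observed feature like "photo", the classifier picks the intent whose training data makes that observation most probable.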

Modelling with all these constraints in mind, I have come up with a suitable architecture, as shown:

Image



Cloud services

This is a short post on the state of the art in cloud technology.
As all of us know, cloud technology is making strides across the globe as a means of providing sustainable infrastructure.
I have identified a couple of courses on IBM's Big Data University that provide lectures and quizzes to test your knowledge of cloud and web services.
These courses are good from the point of view of exploring cloud architecture, infrastructure, and services. On successful course completion, you can download a PDF of the certificate, signed by esteemed instructors from IBM.

I have attached mine to this post.
Amazon-cloud-certificate
certificate-2


Finance Forecasting Tool

Hi everyone,
I have attached my architecture/model for a finance forecasting tool.
It was built by playing with data from stock quotes as well as real-time data from Twitter, analyzed for sentiment about a particular stock.

If anyone is interested in working on this, do drop a comment; I would be extremely happy to crowd-source it and see how accurate this kind of system would be from a user's point of view.
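The core idea can be sketched as a weighted blend of a recent price return and a mean tweet-sentiment score. The weights and numbers below are arbitrary assumptions for illustration, not the model in the attachment.

```python
# Sketch: blend a stock's simple return with average tweet sentiment into a
# single forecast signal. Weights and inputs are made-up assumptions, and
# this is an illustration of the idea, not a trading model.
def signal(prices, sentiments, w_price=0.5, w_sent=0.5):
    ret = (prices[-1] - prices[0]) / prices[0]   # simple return over the window
    sent = sum(sentiments) / len(sentiments)     # mean sentiment, roughly in [-1, 1]
    return w_price * ret + w_sent * sent

# e.g. a stock up 4% with mildly positive tweet sentiment
s = signal([100.0, 104.0], [0.2, 0.6, 0.4])
```

A positive signal would lean bullish, a negative one bearish; the interesting research question is how to learn the weights rather than fixing them by hand.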

Finance_Forecasting_Tool


Work experience

I joined TCS as a trainee straight from university in 2011. I did my Initial Learning Program (ILP) from August 2011 to November 2011, during which I did a deep dive into Java. Since I outperformed my fellow trainees, I was awarded the "Star of the Learners Group" award.

As part of my ILP, I developed an application for a National Self Employment Program using Java and Oracle database technology. This app catered to the needs of different users such as colleges, banks, and NSEP employees.

This was followed by my project experience in TCS Web 2.0 R&D Labs for about a year, in which I designed and developed algorithms and architectures in the areas of Semantic Web, social network analysis, employee reputation models, and machine learning. I also gained experience writing an industry-ready patent on the same work and wrote several well-received white papers.

I was then part of the TCS-IBM Centre of Excellence, in which I familiarized myself with mainframe technology and developed a cloud on IBM System z. I then moved to the Lloyds Banking Group account within TCS, where I am currently exploring client-side browser technologies such as Ember.js and Angular.js, which implement the MVC pattern on the client side. This technology is currently making great waves in designing applications for core industries such as banking, finance, manufacturing, etc.
accolade1

Star Of the Learners Group

Next Generation Email Issues and Solutions


IICAI paper presentation

I am an avid research enthusiast, so this is one of my passionate posts: my experience presenting a paper at the 5th Indian International Conference on Artificial Intelligence (IICAI).

I presented a paper on knowledge harvesting from software documentation. As many of my readers will appreciate, the traditional development, deployment, and maintenance of conventional software applications demand higher quality with shorter time-to-market cycles to reap the benefits of customer delight. They follow a conventional, explicit, and formal representation of the knowledge base, which is shared across the stakeholders.

Hence, the associated documentation across the various stages of the SDLC does not cater to any intelligent extraction and interpretation, either for downstream applications or for enhancements. The research framework I proposed addresses the nuances of converting the existing documentation into an intelligent knowledge representation that can be extracted for effective and efficient utilization of the artifacts.

I have also attached the certificate I was presented with on successful completion of the conference. Also, find attached the presentation given.

This was a mind-blowing experience, collaborating with several dignitaries, professors, and other paper presenters from research institutions and academia.

Image

Knowledge Modeling from Software


Semantic Web

This article provides a short introduction to the Semantic Web and its real-time applications in industry, meant to appeal to amateurs as well as professionals.

I delivered a lecture on the same at Sairam College of Engineering as part of the TCS Academic Interface Program (AIP).

Semantic Web and Research Issues

This article comprehensively covers the following topics:

  •      Semantic Web Technology Landscape
  •      Semantic Web: an Introduction, Issues, Problems and Solutions
  •      Research Areas

The following vision of the Semantic Web was given by the father of the Semantic Web, Tim Berners-Lee:

"I have a dream for the Web in which computers become capable of analyzing all the data on the Web: the content, links, and transactions between people and computers. A 'Semantic Web', which should make this possible, has yet to emerge, but when it does, the day-to-day mechanisms of trade, bureaucracy and our daily lives will be handled by machines talking to machines."

 Semantic Web Technology Landscape

Three points that are relevant from a Semantic Web technology perspective are:

  •            How Semantic Web technology fits into the past, present, and future evolution of the Internet
  •            How Semantic Web technology differs from existing data-sharing technologies, such as relational databases and the current state of the World Wide Web
  •            The three primary international standards that help define the Semantic Web

Semantic Web: An Introduction

The following depicts the WWW's influence on electronically available information:

1990: in-house documentation supporting around 1,000 users
2002: thousands of users and documents
Current trend: 1 billion static documents and 200 million users

Problems in the WWW include:

  •       Difficulty in finding, accessing, presenting, and maintaining information for a wide variety of users
  •       Presentation of browser content in natural-language form
  •       A huge gap between the information available to tools and the information kept in human-readable form

The solution to the problem of information heterogeneity in the WWW is machine-understandable semantics for the information it contains.

Achieving a Semantic Web requires:

  •       Developing languages for expressing machine-understandable meta-information for documents
  •       Developing terminologies (i.e., namespaces or ontologies) using these languages and making them available on the web
  •       Developing tools and new architectures that use such languages and terminologies to provide support in finding, accessing, presenting, and maintaining information sources
  •       Realizing applications that provide a new level of service to the human users of the Semantic Web
          
Research areas in the Semantic Web, among many, are:

        • Databases
        • Intelligent information integration
        • Knowledge representation
        • Knowledge engineering
        • Information agents
        • Knowledge management
        • Information retrieval
        • Natural language processing
        • Metadata
        • Web standards

     
