Wednesday, July 20, 2016

Fast Track SSAS and MDX Training using SQL Server 2016

It has been a long time since I last blogged, as I have been very busy with my authoring assignments and my regular day job.

I have created a fast-track course for learning SQL Server Analysis Services (SSAS) and MDX using SQL Server 2016.

In case you would like to subscribe to the course, here's the link:

https://www.udemy.com/ssas-sql-server-analysis-services-2016-mdx-training/?couponCode=PROMO50

By using this link, my blog readers can get 50% off the course price until the end of July. I hope you find the course useful.

Monday, June 16, 2014

SQL Server vs MongoDB vs MySQL


Microsoft SQL Server is one of the mainstream databases used in most operational systems built on the Microsoft technology stack. One of its biggest shortcomings is the lack of native support for horizontal scaling / sharding. The next logical choice, and the one closest to SQL Server, would be MySQL.

If you are looking for horizontal scaling / sharding, that means you are gearing up to deal with Big Data. MongoDB is arguably the first logical step into the NoSQL world for anyone considering experimenting with NoSQL to handle Big Data.

At this stage, one is faced with the need to compare all these databases. Below is a quick comparison of these databases, with limitations highlighted in red and product strengths in blue.


Reference: DB-Engines.com

Saturday, June 14, 2014

Elasticsearch vs Solr vs Endeca vs Sharepoint FAST vs Google Search Appliance ( GSA ) vs Autonomy vs Semaphore

Enterprise Search is a huge market. Fortunately, there are just a handful of products that cater to it; unfortunately, none of them is a one-size-fits-all product.

There are specific categories of features expected from an enterprise search product, and these determine which requirements it suits. Some of them are listed below:

1) Crawling
  • Web Crawling: An enterprise has most of its content on portals in the form of HTML and media documents. A crawler is the basic means of creating an index out of this content.
  • DB Crawling: Data stored in databases often needs to be crawled or imported into the search inventory.
2) Taxonomy

Taxonomy is the logical organization of content in the enterprise content management system. Some refer to it as the metadata, the structure or the term stores of the index maintained in the system. It is the method of framing structure around the content, so that information can be retrieved more effectively and precisely.

For example, a very simple way of implementing taxonomy is to allow content to be tagged using a set of keywords defined centrally at the organization level.

3) Specialized OOB Search
  • Faceted search (like on Amazon, where a set of categories appears on the left side)
  • Dictionary based search (where you look for a word and its synonyms)
  • Auto-suggest (for example, when you type terms in Google and it suggests a few phrases)
4) Pluggability
  • Ability to index SMTP server
  • Ability to index LDAP server
  • Out-of-box ability to index any such external systems
Systems like Google Search Appliance, Oracle Endeca, HP Autonomy, Microsoft SharePoint search, and Solr are the leaders in this category. Products like Smartlogic Semaphore add a value-added layer on top of them.

But the big question is: where do products like Elasticsearch fit here?

While these products have their positives because they provide the features mentioned above, they also have some downsides / limitations, and this is where Elasticsearch or even Solr steps in.

1) None of these products is cheap. For example, HP Autonomy is reported to have a base price of more than half a million dollars. Not every enterprise has the budget to afford it.

2) Some products do not support database indexing easily. For example, GSA does not make it easy to use complex delta detection queries for indexing data from databases.

3) Most of these products are not horizontally scalable. Apart from appliance solutions, products like Endeca are resource intensive and, because of their scalability architecture, not suitable for managing big data volumes.

4) Custom development to extend the product using APIs is not as easy as it is with open source products.

Custom search for applications is inevitable. Although the enterprise search platform space may be dominated by these products, products like Elasticsearch and Solr will continue to find their place in powering custom applications that manage big data using specialized search functionality (for example, ecommerce sites like amazon.com).

The limitation of products like Elasticsearch is that they lack enterprise-scale features, for example OOB crawlers and the information visualization and reporting layers required for e-discovery and reporting, and they offer very limited taxonomy support, which is crucial for an enterprise search platform. But as the product is still very young and evolving, these features can hopefully be expected over the next couple of years.

Thursday, June 12, 2014

Elasticsearch with .NET : NEST Library Code Example

Elasticsearch can be used with a number of programming languages, one of them being Microsoft .NET. Two clients are available: Elasticsearch.NET (the low level client) and NEST (the high level client).

NEST provides a strongly typed wrapper around the Elasticsearch.NET API, and allows a fully object oriented programming approach to interfacing with Elasticsearch. It also has good documentation for learning the APIs.

The first program I generally want to write is one that indexes a structured document into Elasticsearch using C# code and the NEST APIs. One only needs any version of Visual Studio and the NEST NuGet package installed. Below is the very first console application I wrote to test the .NET integration with Elasticsearch. Let me know whether you liked the code, whether it worked for you, and whether you need any help with programming.


using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;

using Nest;
using Nest.Domain.Connection;

namespace ESConsole
{
    class Program
    {
        static void Main(string[] args)
        {
            var uri = new Uri("http://localhost:9200");
            var settings = new ConnectionSettings(uri).SetDefaultIndex("contacts");
            var client = new ElasticClient(settings);
            

            if (client.Health(HealthLevel.Cluster).ConnectionStatus.Success)
            {
                Console.WriteLine("Connection Successful");
                
                if (client.IndexExists("contacts").Exists)
                {
                    Console.WriteLine("Index Exists");
                    Program.UpsertArticle(client, new Article("The Last Airbender", "Siddharth"), "blog", "article", 1);
                    Program.UpsertContact(client, new Contacts("Siddharth Mehta", "India"), "contacts", "contacts", 2);
                    Console.WriteLine("Data Indexed Successfully");
                }
                else
                {
                    Console.WriteLine("Index Does Not Exist");
                }
                
            }
            else
            {
                Console.WriteLine("Connection Failed");
            }

            Console.ReadKey();

        }

        public class Article
        {
            public string title { get; set; }
            public string artist { get; set; }
            public Article(string Title, string Artist)
            {
                title = Title; artist = Artist;
            }
        }

        public class Contacts
        {
            public string name { get; set; }
            public string country { get; set; }
            public Contacts(string Name, string Country)
            {
                name = Name; country = Country;
            }
        }

        // Index() writes the document under the given id. If a document with that id
        // already exists it is overwritten, which gives the upsert behaviour.
        public static void UpsertArticle(ElasticClient client, Article article, string index, string type, int id)
        {
            var recordInserted = client.Index(article, index, type, id).Id;
            ReportResult(recordInserted);
        }

        public static void UpsertContact(ElasticClient client, Contacts contact, string index, string type, int id)
        {
            var recordInserted = client.Index(contact, index, type, id).Id;
            ReportResult(recordInserted);
        }

        // Treats a missing or empty id in the response as a failed request.
        // The null check avoids a NullReferenceException when no id comes back.
        private static void ReportResult(object recordId)
        {
            if (recordId != null && recordId.ToString() != "")
            {
                Console.WriteLine("Transaction Successful !");
            }
            else
            {
                Console.WriteLine("Transaction Failed");
            }
        }
    }
}

Monday, June 09, 2014

Elasticsearch with SQL Server

Elasticsearch is a very powerful addition to any relational DBMS like SQL Server, Oracle, DB2 etc., provided it is used wisely. Before we look at how to use Elasticsearch with SQL Server, we should ask why to use Elasticsearch with SQL Server. This question holds the key to the answer.

SQL Server holds data either in relational form or in multi-dimensional form (through SSAS). Full Text Search (FTS) in SQL Server provides some out-of-box search features, but when search queries require exhaustive searching over huge datasets and the search definition itself becomes complex, the performance impact is evident. Elasticsearch is primarily a search engine, but with features like facets and the aggregation framework it also helps solve many data analysis problems. For example, all of us have visited sites like Amazon.com, Ebay.com, Flipkart.com etc. Whenever we search for a product, the site builds all the dynamic categories, ranges and values on the fly. For such features, a product like Elasticsearch can be extremely helpful. One such real project example can be read from here.
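To make the idea of categories and ranges built on the fly more concrete, here is a minimal in-memory sketch in C#. It does not call Elasticsearch or NEST; it only illustrates what a facet computes over a set of search hits. The Product class, its fields, the sample catalog and the price boundary of 100 are all assumptions made up for this illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

namespace FacetSketch
{
    // Hypothetical product record; the field names are assumptions for illustration.
    public class Product
    {
        public string Name { get; set; }
        public string Category { get; set; }
        public decimal Price { get; set; }
    }

    public static class Facets
    {
        // Counts the hits per category, similar in spirit to a terms facet.
        public static Dictionary<string, int> ByCategory(IEnumerable<Product> hits)
        {
            return hits.GroupBy(p => p.Category)
                       .ToDictionary(g => g.Key, g => g.Count());
        }

        // Counts the hits per price bucket, similar in spirit to a range facet.
        // The boundary of 100 is an arbitrary choice for this sketch.
        public static Dictionary<string, int> ByPriceRange(IEnumerable<Product> hits)
        {
            return hits.GroupBy(p => p.Price < 100m ? "under 100" : "100 and above")
                       .ToDictionary(g => g.Key, g => g.Count());
        }
    }

    public static class Demo
    {
        public static void Main()
        {
            var catalog = new List<Product>
            {
                new Product { Name = "Smart phone", Category = "Electronics", Price = 299m },
                new Product { Name = "Phone case", Category = "Accessories", Price = 15m },
                new Product { Name = "Desk phone", Category = "Electronics", Price = 45m },
                new Product { Name = "Laptop", Category = "Electronics", Price = 899m }
            };

            // The "search" step: a simple case-insensitive keyword match.
            var hits = catalog
                .Where(p => p.Name.IndexOf("phone", StringComparison.OrdinalIgnoreCase) >= 0)
                .ToList();

            // The facets are computed only over the hits, so they change with every query.
            foreach (var facet in Facets.ByCategory(hits))
                Console.WriteLine("Category {0}: {1}", facet.Key, facet.Value);
            foreach (var facet in Facets.ByPriceRange(hits))
                Console.WriteLine("Price {0}: {1}", facet.Key, facet.Value);
        }
    }
}
```

A search engine computes the same kind of counts inside the index, across the whole cluster, which is why it stays fast on data volumes where grouping rows on each request would not.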



How to use Elasticsearch with SQL Server?


The Elasticsearch JDBC River is the best means (to the best of my knowledge as of this date) of loading data from SQL Server into an Elasticsearch index. One of the best explanations of setting up the Elasticsearch JDBC river with SQL Server can be read from here.

One point to keep in mind is that if you set up a river and then restart the Elasticsearch server, the river will execute its configured query again. This can result in the entire data set being reloaded into the index. If the IDs are fetched from the source, all existing records simply get updated. But if the IDs are autogenerated in Elasticsearch, the reload creates new records, which ultimately leads to duplicate data. So use the river cautiously. You can also delete the river once the data is loaded into the index, if it is a one-time activity for a one-time data migration.
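The difference between source IDs and autogenerated IDs on a reload can be sketched in a few lines of C#. This is an in-memory simulation only: a dictionary stands in for the index, and the Row class, the Loader methods and the sample rows are hypothetical names invented for this illustration, not part of the river or of any Elasticsearch client.

```csharp
using System;
using System.Collections.Generic;

namespace RiverReloadSketch
{
    // Hypothetical source row; Id plays the role of the table's primary key.
    public class Row
    {
        public int Id { get; set; }
        public string Name { get; set; }
    }

    public static class Loader
    {
        // Uses the source primary key as the document id,
        // so loading the same rows again overwrites the existing entries.
        public static void LoadWithSourceIds(IDictionary<string, string> index, IEnumerable<Row> rows)
        {
            foreach (var row in rows)
                index[row.Id.ToString()] = row.Name;
        }

        // Generates a fresh id for every document,
        // so loading the same rows again adds a second copy of each.
        public static void LoadWithGeneratedIds(IDictionary<string, string> index, IEnumerable<Row> rows)
        {
            foreach (var row in rows)
                index[Guid.NewGuid().ToString()] = row.Name;
        }
    }

    public static class Demo
    {
        public static void Main()
        {
            var rows = new List<Row>
            {
                new Row { Id = 1, Name = "Siddharth Mehta" },
                new Row { Id = 2, Name = "Another Contact" }
            };

            var keyed = new Dictionary<string, string>();
            Loader.LoadWithSourceIds(keyed, rows);
            Loader.LoadWithSourceIds(keyed, rows);   // simulates the river running again after a restart
            Console.WriteLine("Source IDs: {0} documents", keyed.Count);        // 2 documents

            var generated = new Dictionary<string, string>();
            Loader.LoadWithGeneratedIds(generated, rows);
            Loader.LoadWithGeneratedIds(generated, rows);
            Console.WriteLine("Generated IDs: {0} documents", generated.Count); // 4 documents
        }
    }
}
```

Selecting the primary key in the river query and mapping it to the document ID is therefore the safer default whenever the river may run more than once.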