MySQL – Wonderous Ponder

Hi folks,

In this post I talk about a few things you may find interesting about Oracle if you are a software developer with MySQL background getting started with Oracle.

I’ve known about Oracle for a long time. Long ago I remember seeing jobs for Oracle DBAs that required Oracle certification.

It seemed like an interesting software to know how to use but there wasn’t a strong push to learn it.

While developing applications professionally, MySQL was generally the database of choice. Later I ran into Postgres in the course of professional work, but not Oracle.

Eventually I got a back-end developer role in a large enterprise and then I met Oracle properly for the first time.

Here are some thoughts on Oracle, looking at it from the perspective of someone with MySQL background.

You’re likely to run into Oracle in large enterprise where lots of data are handled. Not so much in small to medium-scale enterprises. This is because Oracle has lots of powerful features that are targeted towards large enterprises.

For the most part you can write queries using the same old SQL on both MySQL and Oracle databases. SELECTs, JOINS, subqueries work in the same manner. In Oracle, subqueries are called inline views.

Tools: With mysql you use tools like phpMyAdmin and MySQL Workbench. With Oracle the popular clients I’ve come across are Oracle SQL developer and Toad.

When querying data with Oracle via SELECT statement, you can use multiple servers by using the ‘parallel’ keyword e.g. SELECT /*+ parallel */ FOO FROM SOMEDB.SOMETABLE. MySQL doesn’t have that concept.

The way you apply limits and offsets are different. In MYSQL, you can use something like

SELECT ….. LIMIT 10;

But in Oracle, you write

SELECT … FETCH FIRST 10 ROWS ONLY;

In Oracle, when you want to compare values that could possibly be null, you use the NVL function

e.g.

SELECT FOO FROM table1
JOIN table2 on
NVL(table1.some_col, ‘IsNul’) = NVL(table2.some_col, ‘IsNul’)

In terms of connecting to MySQL from your programming environment, if you use Python, then Oracle provides a very good library called cx_Oracle. It provides similar abstractions like you would get with MySQL connector. So you have your cursor which you can use to run queries, etc.

If you use a Mac and need to work with Oracle locally, you would need to use a virtual machine via tools like VirtualBox. With MySQL, you can install it directly on a Mac.

Furthermore, the COMMIT statement is very important in Oracle. You have to commit transactions to persist changes to the database, e.g. creating a table, deleting rows, etc. Usually in MySQL, you just delete rows and that’s it, the change persists.

So, in general, if you’re comfortable with other databases like MySQL or Postgres, you should feel at home with Oracle. Oracle has some peculiarities, but you should be just fine.

That’s it for now. Till next time, happy software development.

As a web developer you may run into occasions where your database struggles to handle search queries. Have you ever been in a situation where you search for a query containing a few keywords and your website completely hangs? Not a very good feeling. This kind of issue happens usually on websites with at least thousands or even hundreds of thousands of rows and where search involves accessing several db tables. Today I’m going to discuss a solution to speed up search on a db. Have you ever searched on your website for something and get rather irrelevant results? The tool I’m going to discuss solves that problem as well. It’s called Sphinx. Sphinx is a powerful search engine written in C++ and allows blazing fast search ranked by relevance, much more than a traditional database can handle. Sphinx is open source and also available under commercial license. I’ll talk about how to set up Sphinx and get started with it on osX.

First, let’s prepare some data to work with. We assume our website is for car catalogs. Note: I got this sample data and PHP file courtesy of a very good Sphinx tutorial at http://www.ibm.com/developerworks/library/os-php-sphinxsearch/

Create a database testdb and add a few tables to it using the following SQL statements:

CREATE TABLE IF NOT EXISTS `Assembly` (
  `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
  `label` varchar(7) NOT NULL,
  `description` varchar(128) NOT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB  DEFAULT CHARSET=utf8 AUTO_INCREMENT=8 ;

--
-- Dumping data for table `Assembly`
--

INSERT INTO `Assembly` (`id`, `label`, `description`) VALUES
(1, '5-00', 'Seats'),
(2, '4-00', 'Electrical'),
(3, '3-00', 'Glasses'),
(4, '2-00', 'Frame'),
(5, '1-00', 'Engine'),
(7, '101-00', 'Accessories');

-- --------------------------------------------------------
--
-- Table structure for table `Inventory`
--

CREATE TABLE IF NOT EXISTS `Inventory` (
  `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
  `partno` varchar(32) NOT NULL,
  `description` varchar(256) NOT NULL,
  `price` float unsigned NOT NULL DEFAULT '0',
  PRIMARY KEY (`id`),
  UNIQUE KEY `partno` (`partno`) USING BTREE
) ENGINE=InnoDB  DEFAULT CHARSET=utf8 AUTO_INCREMENT=10 ;

--
-- Dumping data for table `Inventory`
--

INSERT INTO `Inventory` (`id`, `partno`, `description`, `price`) VALUES
(1, 'WIN408', 'Portal window', 423),
(2, 'ACC711', 'Jack kit', 110),
(3, 'ACC43', 'Rear-view mirror', 55),
(4, 'ACC5409', 'Cigarette lighter', 20),
(5, 'WIN958', 'Windshield, front', 500),
(6, '765432', 'Bolt', 0.1),
(7, 'ENG001', 'Entire engine', 10000),
(8, 'ENG088', 'Cylinder head', 55),
(9, 'ENG976', 'Large cylinder head', 65);

-- --------------------------------------------------------

--
-- Table structure for table `Model`
--

CREATE TABLE IF NOT EXISTS `Model` (
  `id` int(10) unsigned NOT NULL AUTO_INCREMENT,
  `label` varchar(100) NOT NULL,
  `description` varchar(256) NOT NULL,
  `begin_production` int(4) NOT NULL,
  `end_production` int(4) NOT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB  DEFAULT CHARSET=utf8 AUTO_INCREMENT=8 ;

--
-- Dumping data for table `Model`
--

INSERT INTO `Model` (`id`, `label`, `description`, `begin_production`, `end_production`) VALUES
(1, 'X Sedan', 'Four-door performance sedan', 1998, 1999),
(3, 'X Sedan', 'Four door performance sedan, 1st model year', 1995, 1997),
(4, 'J Convertible', 'Two-door roadster, metal retracting roof', 2002, 2005),
(5, 'J Convertible', 'Two-door roadster', 2000, 2001),
(7, 'W Wagon', 'Four-door, all-wheel drive sport station wagon', 2007, 0);

-- --------------------------------------------------------

--
-- Table structure for table `Schematic`
--

CREATE TABLE Schematic (
  id int(10) unsigned NOT NULL auto_increment,
  partno_id int(10) unsigned NOT NULL,
  assembly_id int(10) unsigned NOT NULL,
  model_id int(10) unsigned NOT NULL,
  PRIMARY KEY (id),
  KEY partno_index USING BTREE (partno_id),
  KEY assembly_index USING BTREE (assembly_id),
  KEY model_index USING BTREE (model_id),
  FOREIGN KEY (partno_id) REFERENCES Inventory(id),
  FOREIGN KEY (assembly_id) REFERENCES Assembly(id),
  FOREIGN KEY (model_id) REFERENCES Model(id)
) ENGINE=InnoDB;

--
-- Dumping data for table `Schematic`
--

INSERT INTO `Schematic` (`id`, `partno_id`, `assembly_id`, `model_id`) VALUES
(1, 6, 5, 1),
(2, 8, 5, 1),
(3, 1, 3, 1),
(4, 5, 3, 1),
(5, 8, 5, 7),
(6, 6, 5, 7),
(7, 4, 7, 3),
(8, 9, 5, 3);

-- --------------------------------------------------------

--
-- Structure for view `catalog`
--

CREATE OR REPLACE VIEW Catalog AS
SELECT
  Inventory.id,
  Inventory.partno,
  Inventory.description,
  Assembly.id AS assembly,
  Model.id AS model
FROM
  Assembly, Inventory, Model, Schematic
WHERE
  Schematic.partno_id=Inventory.id 
  AND Schematic.model_id=Model.id 
  AND Schematic.assembly_id=Assembly.id;

A bit of explanation of the tables. Assembly tables is for storing various modules of a car. Inventory stores all the parts we have in store. Model stores the car model. Schematic stores information required to put together a car of a given model with a specific part and for a specific assembly. Finally there’s a view called catalog which brings together the fields we need to serve as source to Sphinx, including the inventory id, part number, description, assembly id and model id.

If we inspect catalog the contents should look like:

+----+---------+---------------------+----------+-------+
| id | partno  | description         | assembly | model |
+----+---------+---------------------+----------+-------+
|  6 | 765432  | Bolt                |        5 |     1 |
|  8 | ENG088  | Cylinder head       |        5 |     1 |
|  1 | WIN408  | Portal window       |        3 |     1 |
|  5 | WIN958  | Windshield, front   |        3 |     1 |
|  4 | ACC5409 | Cigarette lighter   |        7 |     3 |
|  9 | ENG976  | Large cylinder head |        5 |     3 |
|  8 | ENG088  | Cylinder head       |        5 |     7 |
|  6 | 765432  | Bolt                |        5 |     7 |
+----+---------+---------------------+----------+-------+

Next, install Sphinx by downloading the latest release version from http://sphinxsearch.com/downloads/release/. At the moment the latest is 2.2.9. It takes the form of a tar.gz archive.

Extract the contents from the archive and This will give you a folder like sphinx-2.2.9-release-osx10.10-x86_64

Move the folder to /usr/local/sphinx

Next we need to create a Sphinx config file. Change to /usr/local/sphinx and copy the sample config fiel sphinx-min.conf.in to sphinx.conf.

Next, open sphinx.conf and edit it to contain the following contents:

source catalog 
{
	type			= mysql

	sql_host		= localhost
	sql_user		= testdb
	sql_pass		= testdb
	sql_db			= testdb
	sql_port		= 3306	# optional, default is 3306

        sql_query               = \
            SELECT \
                id, partno, description, \
                assembly, model \
            FROM \
                Catalog;
        sql_attr_uint        = assembly
        sql_attr_uint        = model
}
    
index catalog 
{
    source              = catalog
    path                = /usr/local/sphinx/data/catalog
    min_word_len        = 3
    min_prefix_len      = 0
    min_infix_len       = 3
}

indexer
{
	mem_limit		= 128M
}


searchd
{
	listen			= 9312
	listen			= 9306:mysql41
	log			= /usr/local/sphinx/log/searchd.log
	query_log		= /usr/local/sphinx/log/query.log
	read_timeout		= 5
	max_children		= 30
	pid_file		= /usr/local/sphinx/log/searchd.pid
        max_matches             = 1000
	seamless_rotate		= 1
	preopen_indexes		= 1
	unlink_old		= 1
	workers			= threads # for RT to work
	binlog_path		= /usr/local/sphinx/data
}

Save and close the file.

Some explanation. The source section gives Sphinx information about your database. I use a MySQL db in this tutorial. sql_query is the main document fetch query which Sphinx uses to grab data from our db. sql_attr_uint is used to specify unsigned integers. The index section says how to index value. Update its source parameter to your desired value. The searchd section is for the search daemon

Now configuration is complete. To make command line work easier, add /usr/local/sphinx/bin/ to your PATH.

Next, index the data into Sphinx. Run the following command:

indexer --config /usr/local/sphinx/sphinx.conf catalog

You should see some text appear that ends with something like

total 8 docs, 149 bytes
total 0.062 sec, 2397 bytes/sec, 128.71 docs/sec
total 4 reads, 0.000 sec, 0.1 kb/call avg, 0.0 msec/call avg
total 12 writes, 0.000 sec, 0.1 kb/call avg, 0.0 msec/call avg

That means indexing is complete. We can now run queries against Sphinx. First, we need to start the search daemon using the command:

sudo searchd -c /usr/local/sphinx/sphinx.conf

Note: to stop the daemon run the command:

sudo searchd --config /usr/local/sphinx/sphinx.conf --stop

Now we are going to write a PHP file that gets data from Sphinx.

Create a file called query_cat.php and add the following content to it:



include('/usr/local/sphinx/api/sphinxapi.php');

$cl = new SphinxClient();
$cl->SetServer('localhost', 9312);
$cl->SetMatchMode(SPH_MATCH_ANY); // match any word of the query
$cl->SetFilter('model', array(3)); // filter results to those where model is 3

$result = $cl->Query('Windshield', 'catalog'); // query the word Windshield from the index called catalog.

if ($result === false){
    echo 'Query failed: ' . $cl->GetLastError()."\n";
} else {
    if ($cl->GetLastWarning()){
        echo "Warning: ". $cl->GetLastWarning()."\n";
    }
    if ( ! empty($result["matches"]) ) {
        foreach ( $result["matches"] as $doc => $docinfo ) {
            echo "$doc\n";
        }
      
        print_r( $result );
    }

}

You will get an output like

9
Array
(
    [error] => 
    [warning] => 
    [status] => 0
    [fields] => Array
        (
            [0] => partno
            [1] => description
        )

    [attrs] => Array
        (
            [assembly] => 1
            [model] => 1
        )

    [matches] => Array
        (
            [9] => Array
                (
                    [weight] => 1
                    [attrs] => Array
                        (
                            [assembly] => 5
                            [model] => 3
                        )

                )

        )

    [total] => 1
    [total_found] => 1
    [time] => 0.001
    [words] => Array
        (
            [cylind] => Array
                (
                    [docs] => 2
                    [hits] => 2
                )

        )

)

A warning may appear above the message saying that calling a method isn’t recommended, but you can ignore it.

As you can see from the result, 9 is the single result and is indeed the only inventory id for model id 3 containing the keyword ‘container’. Refer to the printout of catalog above.

That’s it for now. There are many other options that can be explored as you play around with Sphinx. Also, there’s a so-called SphinxQL which can be use to query Sphinx instead of using the API.

Category: MySQL

A few thoughts on Oracle viewed from a MySQL background

How to fix MySQL Error: The server quit without updating PID file

Getting Started with Sphinx Search

MySQL Query Logging in Ubuntu