Showing posts with label ISYS 532. Show all posts
Showing posts with label ISYS 532. Show all posts

Monday, April 23, 2007

Prepare to be amazed!

Actually for most of my family, you probably want to tune out of this post. This is a summary post for my ISys 532 class, Information Architecture. So unless web services and model-driven architecture is on your casual reading list, check back later and I’m sure Megan will have posted some extremely witty story. For those who dare, read on!

1. Understand enterprise information architecture so you can articulate a rational position on when and why various supporting methodologies, tools, and protocols are helpful or not helpful in a particular business environment. Major information architecture topics you should know well include the following:

This class has helped me gain the knowledge to move forward and gain a better understanding of what can be done and what is out there for enterprise systems. I liked how we had a general overview of many aspects of information architecture and hands-on assignments in many of them. Actually doing the work helps internalize the processes that we talked about in class.

a. Enterprise application integration

Enterprise systems are getting big, complex and contain terabytes of information. How do we get all that information under control? One simple phrase: Enterprise application integration. What may sound easy isn’t so easy in practice. Enterprise application integration is the concept that all systems can be interwoven through one single interface, to realize competitive advantages over other companies. Most of the time this is a very slow process, as large companies can have thousands of different programs, interfaces and functions (believe me, I worked at Wal-Mart last summer). Most major companies are working towards this goal or have already achieved this goal.

b. Service oriented architecture

I believe SOA is the architecture that will really take off in the years to come and it is already showing up in many organizations. This last week I had a BYU project manager interview for another class. One of his current projects was to implement SOA architecture here at BYU. If BYU is implementing SOA, then we know it has gone mainstream. BYU doesn’t adopt something unless they know it’s a best practice. SOA is a great idea that allows different functions of a system to be broken down into individual services, and these services are independent of one another. These services can then be mashed to together to perform the bigger overall functions. Different areas of the business can call these functions, pass in certain parameters specified by the service and receive a standard answer. SOA cuts down on redundant code and is platform independent.

c. Model-driven architecture and executable models

Kent and I did our paper on Model Driven Architecture. Model-driven architecture is the ability to create working programs by only completing the higher level data modeling. MDA provides great advantages in theory. Once you finish the data model, you have the ability to export code to any language that your tool is compatible with. After talking about MDA with numerous people, I found that MDA, at least among the younger generation, has a less than worthy reputation. Our generation doesn’t believe in MDA. We believe in the concept and think MDA is a good idea, but we don’t believe there is any tool out there that can complete the job from start to finish. I think we’re looking for a service that can produce the whole working system straight from our model, but the MDA programs aren’t currently capable of that. I also learned a lot about the abstraction of programming languages. Every generation of programming languages has gotten easier to understand by human minds. I believe MDA is where the world will move in the future but as of right now, there’s nothing capable of accomplishing the fully functional MDA services from start to finish.

d. Web services

Although there were many grumblings about the difficulty of the web services assignment, I was confident that I could tackle the web services assignment and succeed. Needless to say, the assignment took a lot longer than I anticipated. I wanted to write my web service in PHP, since that was the language I was using at work and was most familiar with and it seemed a lot easier than doing one in Java. There were several good tutorials on how to use PHP to write a web service. Having a PHP web service library abstracted a lot of the more confusing aspects of the web service. Creating a web service to handle one variable was very easy. Throwing in multiple variables of different types was a little trickier. By the end of the assignment, it was rewarding to refresh the page and know that you got this weather information from another web service somewhere out in the Internet. The concept of web services is fascinating. The notion of people creating hundreds of services out there for us to access, and the ability to access it in a like fashion, is where the internet is going. Google allows us to use their GoogleMaps API to access web services in their maps area. Programming a page to use a web service can take as little as five minutes, as showcased in www.ldstemplemaps.com.

2. Understand the key role of database management systems in enterprise information architecture and demonstrate the major elements of DBMS optimization, including DBMS server tuning and SQL query optimization.

The db optimization assignment really had me scared, but Swavek and I sat down and diligently started to chip away at the requirements. It is surprising how something as simple as an index can dramatically affect your query efficiency. Time after time we would test the query before and after and see incredible results. Well, savings of a few tenths of a second, but when you have millions of queries in an enterprise system, tenths of a second makes a huge difference. Afterwards, I reflected on what I had learned and how the assignment could help me at my current work. Right now at my work we’ve switched most of our tables to InnoDB, but we run into problems with tables that are used mostly for reading. InnoDB should be used for tables performing mostly updates and MyISAM should be used for tables performing mostly reads, I heard myself saying to my co-workers. Also, wherever you can make joins in the WHERE clause and not in the FROM statement. The WHERE is performed first and will limit the amount of rows you will actually join on. And lastly, after finishing the whole DB assignment, I admit it was actually a little bit of fun tuning the database.

3. Understand the importance of event logging and system monitoring within an enterprise-class system, and demonstrate how to put a logging/monitoring system in place.

When someone mentions logging and programming together in a sentence, a yawn is usually the most appropriate response. Dr. Liddle taught us otherwise. Logging is a very important function of any enterprise architecture. With million of things going on in your system, diagnosing a problem would be very difficult without a standardized way of dealing with errors. At my current workplace, on programmer in particular likes to put the error 2319 at many places in his code. If you’ve seen Monsters Inc., you’ll remember that number as the code for human contamination. Needless to say, that error doesn’t do a whole lot of good in explaining anything about the particular error. You might as well put nothing down! This is where logging would come in handy.

4. Enhance your teamwork and communication skills by delivering a solid group paper and an in-class presentation on an ISys 532 topic.

Kent and I had great expectations to jump on this paper from the get go and be finished with two weeks to spare. Needless to say, as always, other things popped up and we spent the last few days working vigorously to produce the paper and presentation. Overall, it was a good experience to delve more deeply into an information architecture topic. I enjoyed all of the presentations or snapshots of what is out there. Having presentations on lots of topics helped me choose which topics interest me and from there I can research further in those topics.

Conclusion

This class was great in that it exposed me to a lot of the different architectures to be found in any typical enterprise system. What I’ve realized is that there is a lot of information out there and I can’t begin to be an expert in everything or even a novice. But as a future manager and hopefully CIO, I’m expected to have a general surface knowledge of technologies that can and will help my company achieve better results. This class has taught me how to research these technologies and gain a better understanding of them through trial and error, reading articles and peer presentations.

Great class Dr. Liddle, keep up the good work!


Tuesday, April 17, 2007

DB Optimization Assignment

MRP Database

Complex and Simple Query Division

Below is our division of simple and complex queries. To make things more readable, we have included only the complex queries in this paper (Appendix A). The simple queries are all other queries not included in Appendix A.

In our division, we have mostly looked at typecasting, multiple joins, and nested queries. Queries which included any of these three elements were classified as complex.

Indexes

In general, indexes should be added to optimize the queries in two cases. First, indexes should be added whenever fields are used in the WHERE clause, especially to compute JOINs. Second, indexes should be added when fields are computed outside of the WHERE clause. An example would be SELECT Max(X) where the maximum function works in the select clause. The index on field X should speed things up. We assume that primary keys are indexed automatically, as this happened during our importing of the data into MySQL.

In accordance with the guidelines, the following indices should be added:

  • Activity.campID
  • Double index (order and reverse order) on adjacentcampground.groundID and adjacentcampground.adjacentTo (since this is a linking or associate table)
  • Agentstake.stakeID (this does not seem to be a part of composite primary key)
  • Agentstake.campID
  • Amenity.groundID
  • Announcement.campID
  • Double index (order and reverse order) on assignedto.stakeID and assignedto.campID (since this is a linking or associate table)
  • Campground.campID
  • Camppicture.campID
  • Camprule.catID
  • Camprulecategory.catID
  • Campsite.groundID
  • Made codes.code the PK
  • Direction.campID
  • Facility.campID
  • Made glossary.term the PK
  • Reservation.stake
  • Reservation.ward
  • Reservation.campID
  • Reservationdate.reservedate
  • Reservationdate.reserveID
  • Double index (order and reverse order) on reservationforcampsite.siteID and reservationforcampsite.reserveID (since this is a linking or associate table)
  • Ruletable.catID
  • Unavailabledate.campID
  • Made usertable.userID the PK
  • Double index (order and reverse order) on usercamp.campID and usercamp.userID (since this is a linking or associate table)
  • Made ward.wardID the PK
  • Ward.stakeID

After careful consideration, we selected the following complex queries to optimize:

Query Optimization 1

Before

SELECT announceID, announceTitle, announceText,

CAST(MONTH(annBeginDate) AS varchar) + '/' +

CAST(DAY(annBeginDate) AS varchar) + '/' +

CAST(YEAR(annBeginDate) AS varchar) annBeginDate,

CAST(MONTH(annEndDate) AS varchar) + '/' +

CAST(DAY(annEndDate) AS varchar) + '/' +

CAST(YEAR(annEndDate) AS varchar) annEndDate

FROM Announcement

WHERE campID='cid'

After

SELECT announceID, announceTitle, announceText, annBeginDate, annEndDate

FROM announcement

WHERE camped = 'cid'


This query suffers from several unnecessary complexities. First of all, casting the different parts of a date as a string does not make sense in this case, as it's reassembled in the exact format of the datetime data type. Thus, all the CASTing can be removed to improve query speed. Additionally, an index on Announcement.campID should be added.

This cuts the number of rows returned from several (e.g. 21 for campID=11) to 1. With the casting, we couldn't get the query to run at all despite an hour of research. After the casting was removed, the query took 0.0224 seconds uncached, 0.0003 cached. After the index was added, the query took .0011 seconds uncached and .0003 uncached. Since this table does not change a lot, this optimization actually does not help the database performance tremendously, because after the query is cached, the time to execute it is the same. The optimization does, however, show the principles of proper query design.

Query Optimization 2

Before

SELECT r.reservid, r.reservedBy, r.stake, r.ward, r.groupType,

CAST(r.startDate AS DateTime) AS arrival,

CAST(r.endDate AS DateTime) AS departure,

(SELECT stakeName FROM Stake s

WHERE s.stakeID=r.stake) AS stakeName,

(SELECT wardName FROM Ward w

WHERE CAST(w.wardID AS NVARCHAR)=r.ward) AS wardName

FROM Reservation r

INNER JOIN ReservationForCampsite rfc ON (r.reservid=rfc.reservid)

WHERE ((CAST(r.startDate AS DateTime) >= 'firstDate' AND

CAST(r.startDate AS DateTime) <= 'lastDate') OR

(CAST(r.endDate AS DateTime) >= 'firstDate' AND

CAST(r.endDate AS DateTime) <= 'lastDate')) AND r.campID=cid AND rfc.siteID=sid

ORDER BY departure

After

SELECT r.reservid, r.reservedBy, r.stake, r.ward, r.groupType, r.startDate AS arrival, r.endDate AS departure,

(SELECT stakeName FROM Stake s WHERE s.stakeID = r.stake ) AS stakeName,

(SELECT wardName FROM Ward w WHERE w.wardID = r.ward) AS wardName

FROM Reservation r, ReservationForCampsite rfc

WHERE r.reservid = rfc.reservid AND

((r.startDate >= 'firstDate' AND r.startDate <= 'lastDate') OR (r.endDate >= 'firstDate' AND r.endDate <= 'lastDate'))

AND r.campID=cid AND rfc.siteID=sid

ORDER BY departure


To begin the optimization, we took out the casting and changed all the varchars to datetimes, which is what we think they should be in the first place. Again, we could not get the original query to execute, so we don't know exactly how much time was gained through this change. Having seen optimization in practice before, however, we feel confident this change speeds up the query. We then changed the ward field by first replacing "all" with null and then changing the field type to int to remove the need to cast w.wardid as a varchar (now null became a special value to mean "all").

We indexed reservation.ward, changed reservation.stake to int(11) from varchar, then indexed reservation.stake as well. In the reservationforcampsite, an association (linking) table, we created two indexes for the two fields, rfc.siteID and rfc.reserveID. The indexes contained both fields, each index in different order.

When the query was first run, before the indexing, it took .0392 seconds uncached and .0004 seconds cached. After the indexing the query took .0030 seconds uncached and .0004 seconds cached.

Query Optimization 3

Before

SELECT s.siteID, s.siteName, s.siteCapacity, g.groundName, r.reservID

FROM Campsite s

INNER JOIN Campground g ON (s.groundID=g.groundID)

INNER JOIN Camp c ON (g.campID=c.campID)

LEFT JOIN ReservationForCampsite r ON (r.reservID=" + editMode AND r.siteID=s.siteID)

WHERE c.campID=cid AND c.campID=g.campID AND g.groundID=s.groundID

ORDER BY g.groundName, s.siteName

After

SELECT s.siteID, s.siteName, s.siteCapacity, g.groundName, r.reservID

FROM Campsite s, Campground g, Camp c, ReservationForCampsite r

WHERE c.campID=cid AND c.campID=g.campID AND g.groundID=s.groundID AND r.siteID = s.siteID

ORDER BY g.groundName, s.siteName


Before we started on the optimization, we decided to remove the mysterious 'r.reservID=" + editMode AND' from the FROM clause. This would actually cause the query to break, as the lone double quote would completely throw it off. So, part of the optimizing sometimes involves getting things to work. After this small change, we ran the query, which completed in .0473 seconds uncached and .0003 cached.

The next step involved removing the inner joins from the FROM clause, as the data is joined in the WHERE clause anyway. This improvement alone brought the query execution time to .0024 seconds uncached, and .0003 cached.

Next, we created indexes on campground.campID, campsite.groundID. There were already indexes in the reservationforcampsite table, so we didn't need to create them. All these changes brought down the query execution time to .0016 uncached, and .0003 cached.

Database Schema Changes

The main changes to the schema we recommend are data type changes. The data types in the beginning database are often not ideal for the data they store. In addition to the datatype changes already made for the query optimization, the following datatype changes should be made:

  • Reservationdate.reservdate from varchar to date
  • Unavailabledate.unavailDate from varchar to date
  • Reservation.startDate and endDate from varchar to date

Additionally, the following changes can be made to improve the schema:

  • The tables priority schedule and nextpriorityschedule should be combined, because the data would not lose any integrity, and logically nextpriorityschedule is simply a continuation of priorityschedule. Basically, currently we have horizontal fragmentation of a table with 12 rows, and that's just cruel.

Changes vs. Improvements

Whether the changes to the database are actually beneficial needs to be carefully measured. Specifically, changes made need to increase the efficiency of the database as measured by the time queries take to complete. If a change to a table increases the speed of the queries operating on that table, it is a good indication that the changes are really improvements.

We have sampled the complex queries optimized above and they all yielded improvements. Even though we have not measured every single index and datatype change, we have no reason to suspect they would yield a different result.

Scriptures Database

Index Changes

Given the database schema and the queries, there are several unnecessary indexes. Indexes are meant to speed up queries, and they shouldn't exist if they don't serve that purpose, because indexes also slow down transactions during updates. This is the underlying motivation for the JODDB index changes.

The general recommendation is to remove all indexes on fields which are not in the WHERE clauses or computation functions of queries. For example, the 'speaker' table contains three indexes which are not tied to any WHERE clauses or computation functions of queries. Therefore, each of these indexes could be removed. Additionally, the "LastName_2" is an index of two columns, which can also be used as an index for just one of the columns, making the "LastName" index unnecessary. Without listing every single index, suffice it to say that these two general tests ((1)indexes used in WHERE clauses and (2) index redundancy) should be implemented across the whole database.

Recommendations

The recommendations above to delete unnecessary indexes would actually increase database performance for updates as well as decrease database size, both of which are part of optimization.

Tuesday, April 3, 2007

Log4J vs. JMS

To start off I'll give a brief description of each topic.

Log4J - is a java utility that allows a developer to turn logging off and on at runtime. Logging statements can be a resource consuming task and could take up thousands of lines of code. Logging can also slow down performance because of all the extra code. Log4J seeks to minimize the resource usage. Logging can be a very important tool, not only for error messaging but also for debugging purposes. Log4J uses inheritance to help control the message levels that are outputted.

JMS - This topic is a little more complicated than Log4J. As far as I can tell JMS is way for your java program to pass certain types of messages from one server to another.
  • Publish-Subscribe - When a single message needs to be received by multiple applications. (Seems like this could also be accomplished through web services)
  • Point-to-Point - When one application is sending a direct message to another application
  • Request-Reply - When an application sends a message and is waiting for a message in return
For more information on JMS, http://my.execpc.com/~gopalan/jms/jms.html.

JMS seems like more of a way to pass information from one application to another, much like web services. Log4J is used in one application for logging and debugging purposes. There are a lot of different versions of Log4J, including all of the major programming languages. On the other hand JMS is only for use with Java. I'm not exactly sure how Log4J and JMS will fit together. I'll have to wait for class tomorrow.

Monday, February 19, 2007

Web Services Assignment

1. The web service wasn't as hard to implement as I had imagined. I hadn't used my Java skills since the junior core and really wasn't looking forward to shaking the dust off my Eclipse. I decided to go with PHP, the language of choice at my work. I discovered that it was relatively simple to create and consume web services with PHP. There was a bunch of libraries out there for me to choose from. At first I didn't realize that I there were multiple libraries, but as I was reading tutorials about the PEAR soap implementation and finding they weren't working, I realized that I was using the nusoap library. Once I got past that hurdle, I was able to find several good resources, such as: Programming with NuSoap and Creating and Consuming Web Services with PHP. The most trouble I had was returning multiple variables as an output. My web service would only allow me to return only one variable and I finally figured out how to pass back an array filled with the weather information as my one output variable.

2. The wsdl of my service can be accessed at http://matt.circlepix.com/scripts/WebService/soapServer.php in a cool web services format. The wsdl can also be accessed by attaching ?wsdl to the address. Here is the sample code needed to access my weather web service:

<?php

//Get the Nusoap Library
require_once('nusoap.php');

//Create the client instance
$client = new soapclient(
'http://matt.circlepix.com/scripts/WebService/soapServer.php?wsdl', true);


//Testing variables
//$zipcode= 'provo';
//$zipcode= 'orem';
//$zipcode= '84057';
$zipcode= '84602';

//Call the soap method
$result = $client->call('getWeather', array('zipcode' => $zipcode));

//Print the result
print_r($result);

?>

3. Web services give you the ability to not rewrite code that has already been written. Whenever you want to do that same thing or anyone else for that matter, all they have to do is call that same function, with the same parameters and instantly you have what you want. Once you get the hang of creating a web service, I don't think it would be that hard to implement many of them. I believe web services separate the business logic out of the business objects and creates another layer to access. It would be difficult to integrate a web services framework into an already functioning system. Why use web services and access a service across the network, when you can create functions within your own system and include the other pages right into your code? It seems like it would be faster and the delivery would be less difficult, to use code that you've already written but use a different method to transfer it. Web services is also based upon standards that haven't been finalized. What happens if you base your whole company on RPC standard web services and then everybody adopts REST web services? I think there is a case to use web services to fill little niches in your code, such as a weather service portlet. As far as creating the whole system as a web service; I don't think that would be a good idea for any company who has already invested deeply in building their own information architecture.

4. I think Service Oriented Architecture (SOA) can essentially be found in any company in some form. We create common functions that will be used many times and put them in common files (i.e. database connection logic). We access these files at the beginning of other files and call these functions when needed. Are we not calling a service, to help us implement something? The only difference is the way we access the information. Accessing other functions the web services way, over the Internet, will significantly increase the time and bandwidth that a company uses up. One case where web services could be of use, is in a dashboard setting. If you want to create a home page filled with portlets of different kinds, a company could call different web services, such as weather, clock or news aggregators, creating a mashup of different web services. Whether or not SOA is the future; I think only time will tell.

Monday, January 22, 2007

Information Integration

In the rapidly changing business world of today, the importance of information has increased the need for better information management tools. Almost all major companies are driven by information, including Google, Amazon and Wal-Mart. Whoever controls the information, has the power. So it is not surprising that IBM has come out with their Websphere Information Integration platform and their Information Management software. The articles contain a brief overview of how the process works, but I would like to see a company that has information problems and see how bringing in the IBM information server would change things. I think every company is in need of controlling their information more efficiently. This is the Information Age and Information is power.

Saturday, January 20, 2007

Web Services

Since my family also reads all of the posts for my class, I though I would start off with the definition of web services. A service provided by a website that can be accessed by anyone as long as they know how use the proper protocols to obtain the service. I was excited when we started talking about them in class, because my friend Nate Skousen and I were starting to develop a website that used Google's web service, Google Maps. The website address is www.ldstemplemaps.com. We want to make it the first stop for people trying to obtain information about LDS temples and we want to eventually expand to LDS places. It took us about two hours total to setup everything up. Web services are changing the web. It's so easy to do something like this with anything. I was thinking it would be a lot of fun for a family website. People could add anything they wanted to the map, simply by putting in a name, picture and address. You could create a personal Anderton family map with our own memory places. Hope that wasn't too technical family!

Wednesday, January 10, 2007

ISYS 532

When I hear the words enterprise application integration or information architecture, I don't think it has quite the same buzz factor as say AJAX or web analytics. I want to have this course bring to life these words and make them exciting for me. How can I use them in my daily work? Why should they be important to me? Based upon the first two days, I think we're on our way to that goal. My information architecture knowledge is limited and I want to come out of this class being able to design an enterprise information infrastructure.

I started out programming in Java. Recently, I started a new job, which has allowed me to learn PHP and lots of javascript. I also had a chance to program an AJAX application, which was really fun. This summer I had the chance to work on the Wal-Mart Windows Server Engineering Team, administering over 25,000 windows servers. The first day I was installing $2,000 fiber-optic network cards, never having even opened a server before. I was nervous!