Monday, April 23, 2007
Wrapping Up
Prepare to be amazed!
Actually, for most of my family, you probably want to tune out of this post. This is a summary post for my ISys 532 class, Information Architecture. So unless web services and model-driven architecture are on your casual reading list, check back later and I’m sure Megan will have posted some extremely witty story. For those who dare, read on!
1. Understand enterprise information architecture so you can articulate a rational position on when and why various supporting methodologies, tools, and protocols are helpful or not helpful in a particular business environment. Major information architecture topics you should know well include the following:
This class helped me move forward and gain a better understanding of what can be done and what is out there for enterprise systems. I liked how we got a general overview of many aspects of information architecture along with hands-on assignments in many of them. Actually doing the work helps internalize the processes we talked about in class.
a.
b. Service oriented architecture
I believe SOA is the architecture that will really take off in the years to come, and it is already showing up in many organizations. Just last week I interviewed a BYU project manager for another class, and one of his current projects was to implement SOA here at BYU. If BYU is implementing SOA, then we know it has gone mainstream; BYU doesn’t adopt something unless they know it’s a best practice. SOA is a great idea that breaks the different functions of a system down into individual services that are independent of one another. These services can then be mashed together to perform the bigger overall functions. Different areas of the business can call these services, pass in the parameters specified by the service, and receive a standard answer. SOA cuts down on redundant code and is platform independent.
c. Model-driven architecture and executable models
Kent and I did our paper on model-driven architecture. MDA is the idea of creating working programs by completing only the higher-level models. In theory, MDA provides great advantages: once you finish the model, you can export code to any language your tool supports. After talking about MDA with numerous people, though, I found that MDA, at least among the younger generation, has a less than worthy reputation. Our generation believes in the concept and thinks MDA is a good idea, but we don’t believe there is any tool out there that can complete the job from start to finish. We’re looking for a tool that can produce the whole working system straight from our model, and the current MDA programs aren’t capable of that. I also learned a lot about the abstraction of programming languages: every generation of programming languages has gotten easier for human minds to understand. I believe MDA is where the world will move in the future, but as of right now, nothing can deliver a fully functional system from model to finished product.
d. Web services
Although there were many grumblings about the difficulty of the web services assignment, I was confident I could tackle it and succeed. Needless to say, the assignment took a lot longer than I anticipated. I wanted to write my web service in PHP, since that was the language I was using at work and was most familiar with, and it seemed a lot easier than doing one in Java. There were several good tutorials on writing a web service in PHP, and having a PHP web service library abstracted away a lot of the more confusing aspects. Creating a web service to handle one variable was very easy; throwing in multiple variables of different types was a little trickier. By the end of the assignment, it was rewarding to refresh the page and know that the weather information came from another web service somewhere out on the Internet. The concept of web services is fascinating. The notion of people creating hundreds of services out there for us to access, all in a uniform fashion, is where the Internet is going. Google lets us use the Google Maps API to access the web services behind their maps. Programming a page to use a web service can take as little as five minutes, as showcased at www.ldstemplemaps.com.
2. Understand the key role of database management systems in enterprise information architecture and demonstrate the major elements of DBMS optimization, including DBMS server tuning and SQL query optimization.
The DB optimization assignment really had me scared, but Swavek and I sat down and diligently started to chip away at the requirements. It is surprising how something as simple as an index can dramatically affect your query efficiency. Time after time we would test a query before and after and see incredible results. Well, savings of a few tenths of a second, but when you have millions of queries in an enterprise system, tenths of a second make a huge difference. Afterwards, I reflected on what I had learned and how the assignment could help me at my current work. Right now at work we’ve switched most of our tables to InnoDB, but we run into problems with tables that are used mostly for reading. "InnoDB should be used for tables performing mostly updates, and MyISAM for tables performing mostly reads," I heard myself saying to my co-workers. Also, wherever you can, make joins in the WHERE clause and not in the FROM clause; the WHERE is evaluated first and limits the number of rows you actually join on. And lastly, after finishing the whole DB assignment, I admit it was actually a little bit of fun tuning the database.
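Here is a minimal sketch of both ideas in MySQL, just for my own notes; the table and column names (page_views, orders, customers, customerID, orderDate) are hypothetical and not from our actual system:

ALTER TABLE page_views ENGINE = MyISAM;  -- read-heavy table (hypothetical)
ALTER TABLE orders ENGINE = InnoDB;      -- update-heavy table (hypothetical)

CREATE INDEX idx_orders_customer ON orders (customerID);

SELECT c.name, o.total
FROM customers c, orders o
WHERE c.customerID = o.customerID  -- the join expressed in the WHERE clause
  AND o.orderDate >= '2007-01-01';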
3. Understand the importance of event logging and system monitoring within an enterprise-class system, and demonstrate how to put a logging/monitoring system in place.
When someone mentions logging and programming together in a sentence, a yawn is usually the most appropriate response. Dr. Liddle taught us otherwise. Logging is a very important function of any enterprise architecture. With millions of things going on in your system, diagnosing a problem would be very difficult without a standardized way of dealing with errors. At my current workplace, one programmer in particular likes to output error 2319 in many places in his code. If you’ve seen Monsters, Inc., you’ll remember that number as the code for human contamination. Needless to say, that error doesn’t do a whole lot of good in explaining anything about the particular problem; you might as well put nothing down! This is where real logging would come in handy.
4. Enhance your teamwork and communication skills by delivering a solid group paper and an in-class presentation on an ISys 532 topic.
Kent and I had great expectations to jump on this paper from the get-go and be finished with two weeks to spare. Needless to say, as always, other things popped up and we spent the last few days working vigorously to produce the paper and presentation. Overall, it was a good experience to delve more deeply into an information architecture topic. I enjoyed all of the presentations; they were snapshots of what is out there. Hearing about lots of topics helped me figure out which ones interest me, so I can research those further on my own.
Conclusion
Great class Dr. Liddle, keep up the good work!
Thursday, April 19, 2007
For the Love of the Game



Today was a reading day before finals, so naturally we didn't study! Matt had a softball game tonight, and let me tell you, he is one good ballplayer! I (the dedicated wife) braved the freezing weather (yes, it's freezing and it's April!) and went out to cheer him on. Here are the stats: he was 4-4 with three runs scored, batting leadoff. What a champ! He made some great hits, catches, and slides into home plate...and sacrificed his body doing it, literally. The ball bounced off another guy's glove and hit Matt in the "you know where," AND he sliced his leg sliding into home. His leg was dripping blood after it happened, but these pictures don't really do it justice. So of course he was limping a little after the game. Despite all the sacrifice they lost 15-20. This is a huge improvement for the team considering last week's score was 16-0. GOOOO Circlepix!
Tuesday, April 17, 2007
DB Optimization Assignment
MRP Database
Complex and Simple Query Division
Below is our division of simple and complex queries. To make things more readable, we have included only the complex queries in this paper (Appendix A). The simple queries are all other queries not included in Appendix A.
In making the division, we looked mostly at typecasting, multiple joins, and nested queries. Queries that included any of these three elements were classified as complex.
Indexes
In general, indexes should be added to optimize the queries in two cases. First, indexes should be added on fields used in the WHERE clause, especially those used to compute JOINs. Second, indexes should be added when fields are used in computations outside the WHERE clause; for example, in SELECT MAX(X) the aggregate runs in the SELECT clause, and an index on field X should speed things up. We assume that primary keys are indexed automatically, as this happened when we imported the data into MySQL.
In accordance with the guidelines, the following indices should be added (a sketch of the CREATE INDEX syntax follows the list):
- Activity.campID
- Double index (order and reverse order) on adjacentcampground.groundID and adjacentcampground.adjacentTo (since this is a linking or associative table)
- Agentstake.stakeID (this does not seem to be part of the composite primary key)
- Agentstake.campID
- Amenity.groundID
- Announcement.campID
- Double index (order and reverse order) on assignedto.stakeID and assignedto.campID (since this is a linking or associative table)
- Campground.campID
- Camppicture.campID
- Camprule.catID
- Camprulecategory.catID
- Campsite.groundID
- Made codes.code the PK
- Direction.campID
- Facility.campID
- Made glossary.term the PK
- Reservation.stake
- Reservation.ward
- Reservation.campID
- Reservationdate.reservedate
- Reservationdate.reserveID
- Double index (order and reverse order) on reservationforcampsite.siteID and reservationforcampsite.reserveID (since this is a linking or associative table)
- Ruletable.catID
- Unavailabledate.campID
- Made usertable.userID the PK
- Double index (order and reverse order) on usercamp.campID and usercamp.userID (since this is a linking or associative table)
- Made ward.wardID the PK
- Ward.stakeID
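For clarity, the additions above look roughly like the following in MySQL (the index names are our own invention); the first statement is a plain single-column index, and the last two show the "double index" (both column orders) pattern we used for linking tables:

CREATE INDEX idx_activity_campID ON activity (campID);

-- linking table: one composite index in each column order
CREATE INDEX idx_adjacent_fwd ON adjacentcampground (groundID, adjacentTo);
CREATE INDEX idx_adjacent_rev ON adjacentcampground (adjacentTo, groundID);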
After careful consideration, we selected the following complex queries to optimize:
Query Optimization 1
Before
SELECT announceID, announceTitle, announceText,
CAST(MONTH(annBeginDate) AS varchar) + '/' +
CAST(DAY(annBeginDate) AS varchar) + '/' +
CAST(YEAR(annBeginDate) AS varchar) annBeginDate,
CAST(MONTH(annEndDate) AS varchar) + '/' +
CAST(DAY(annEndDate) AS varchar) + '/' +
CAST(YEAR(annEndDate) AS varchar) annEndDate
FROM Announcement
WHERE campID='cid'
After
SELECT announceID, announceTitle, announceText, annBeginDate, annEndDate
FROM announcement
WHERE campID = 'cid'
This query suffers from several unnecessary complexities. First of all, casting the different parts of a date as a string does not make sense in this case, as it's reassembled in the exact format of the datetime data type. Thus, all the CASTing can be removed to improve query speed. Additionally, an index on Announcement.campID should be added.
This cuts the number of rows examined from several (e.g. 21 for campID=11) to 1. With the casting in place, we couldn't get the query to run at all despite an hour of research. After the casting was removed, the query took 0.0224 seconds uncached and 0.0003 cached. After the index was added, the query took .0011 seconds uncached and .0003 cached. Since this table does not change a lot, this optimization does not actually help database performance tremendously, because once the query is cached, the time to execute it is the same. The optimization does, however, show the principles of proper query design.
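Although we did not include it in the writeup, MySQL's EXPLAIN is a quick way to confirm that the new index is actually being used, for example:

EXPLAIN SELECT announceID, announceTitle, announceText, annBeginDate, annEndDate
FROM announcement
WHERE campID = 11;

-- once the index exists, the 'key' column of the EXPLAIN output should name the campID index,
-- and the 'rows' estimate should drop from the full table size to the handful of matching rows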
Query Optimization 2
Before
SELECT r.reservid, r.reservedBy, r.stake, r.ward, r.groupType,
CAST(r.startDate AS DateTime) AS arrival,
CAST(r.endDate AS DateTime) AS departure,
(SELECT stakeName FROM Stake s
WHERE s.stakeID=r.stake) AS stakeName,
(SELECT wardName FROM Ward w
WHERE CAST(w.wardID AS NVARCHAR)=r.ward) AS wardName
FROM Reservation r
INNER JOIN ReservationForCampsite rfc ON (r.reservid=rfc.reservid)
WHERE ((CAST(r.startDate AS DateTime) >= 'firstDate' AND
CAST(r.startDate AS DateTime) <= 'lastDate') OR
(CAST(r.endDate AS DateTime) >= 'firstDate' AND
CAST(r.endDate AS DateTime) <= 'lastDate')) AND r.campID=cid AND rfc.siteID=sid
ORDER BY departure
After
SELECT r.reservid, r.reservedBy, r.stake, r.ward, r.groupType, r.startDate AS arrival, r.endDate AS departure,
(SELECT stakeName FROM Stake s WHERE s.stakeID = r.stake ) AS stakeName,
(SELECT wardName FROM Ward w WHERE w.wardID = r.ward) AS wardName
FROM Reservation r, ReservationForCampsite rfc
WHERE r.reservid = rfc.reservid AND
((r.startDate >= 'firstDate' AND r.startDate <= 'lastDate') OR (r.endDate >= 'firstDate' AND r.endDate <= 'lastDate'))
AND r.campID=cid AND rfc.siteID=sid
ORDER BY departure
To begin the optimization, we took out the casting and changed all the varchars to datetimes, which is what we think they should be in the first place. Again, we could not get the original query to execute, so we don't know exactly how much time was gained through this change. Having seen optimization in practice before, however, we feel confident this change speeds up the query. We then changed the ward field by first replacing "all" with null and then changing the field type to int to remove the need to cast w.wardid as a varchar (now null became a special value to mean "all").
We indexed reservation.ward, changed reservation.stake from varchar to int(11), and then indexed reservation.stake as well. In reservationforcampsite, an associative (linking) table, we created two indexes over the two fields, rfc.siteID and rfc.reserveID; each index contains both fields, in opposite column orders.
When the query was first run, before the indexing, it took .0392 seconds uncached and .0004 seconds cached. After the indexing the query took .0030 seconds uncached and .0004 seconds cached.
Query Optimization 3
Before
SELECT s.siteID, s.siteName, s.siteCapacity, g.groundName, r.reservID
FROM Campsite s
INNER JOIN Campground g ON (s.groundID=g.groundID)
INNER JOIN Camp c ON (g.campID=c.campID)
LEFT JOIN ReservationForCampsite r ON (r.reservID=" + editMode AND r.siteID=s.siteID)
WHERE c.campID=cid AND c.campID=g.campID AND g.groundID=s.groundID
ORDER BY g.groundName, s.siteName
After
SELECT s.siteID, s.siteName, s.siteCapacity, g.groundName, r.reservID
FROM Campsite s, Campground g, Camp c, ReservationForCampsite r
WHERE c.campID=cid AND c.campID=g.campID AND g.groundID=s.groundID AND r.siteID = s.siteID
ORDER BY g.groundName, s.siteName
Before we started on the optimization, we decided to remove the mysterious 'r.reservID=" + editMode AND' fragment from the FROM clause. That fragment would actually cause the query to break, as the lone double quote completely throws it off; part of optimizing sometimes involves just getting things to work. After this small change, we ran the query, which completed in .0473 seconds uncached and .0003 cached.
The next step involved removing the inner joins from the FROM clause, as the data is joined in the WHERE clause anyway. This improvement alone brought the query execution time to .0024 seconds uncached, and .0003 cached.
Next, we created indexes on campground.campID and campsite.groundID. There were already indexes on the reservationforcampsite table, so we didn't need to create them. All of these changes brought the query execution time down to .0016 uncached and .0003 cached.
Database Schema Changes
The main changes we recommend to the schema are data type changes. The data types in the original database are often not ideal for the data they store. In addition to the data type changes already made for the query optimization, the following changes should be made (a sketch of the ALTER statements follows the list):
- Reservationdate.reservdate from varchar to date
- Unavailabledate.unavailDate from varchar to date
- Reservation.startDate and endDate from varchar to date
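A rough sketch of these changes in MySQL (assuming the existing varchar values are in a format MySQL can convert to DATE):

ALTER TABLE reservationdate MODIFY reservdate DATE;
ALTER TABLE unavailabledate MODIFY unavailDate DATE;
ALTER TABLE reservation MODIFY startDate DATE, MODIFY endDate DATE;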
Additionally, the following changes can be made to improve the schema:
- The tables priorityschedule and nextpriorityschedule should be combined, because the data would not lose any integrity, and logically nextpriorityschedule is simply a continuation of priorityschedule. Basically, we currently have horizontal fragmentation of a table with 12 rows, and that's just cruel.
Changes vs. Improvements
Whether the changes to the database are actually beneficial needs to be carefully measured. Specifically, changes made need to increase the efficiency of the database as measured by the time queries take to complete. If a change to a table increases the speed of the queries operating on that table, it is a good indication that the changes are really improvements.
We have sampled the complex queries optimized above and they all yielded improvements. Even though we have not measured every single index and datatype change, we have no reason to suspect they would yield a different result.
Scriptures Database
Index Changes
Given the database schema and the queries, there are several unnecessary indexes. Indexes are meant to speed up queries, and they shouldn't exist if they don't serve that purpose, because indexes also slow down inserts and updates. This is the underlying motivation for the JODDB index changes.
The general recommendation is to remove all indexes on fields that do not appear in the WHERE clauses or computation functions of the queries. For example, the 'speaker' table contains three indexes that are not tied to any WHERE clause or computation, so each of these indexes could be removed. Additionally, "LastName_2" is an index over two columns, and a two-column index can also serve as an index on its leading column, which makes the single-column "LastName" index unnecessary. Without listing every single index, suffice it to say that these two general tests ((1) keep only indexes used in WHERE clauses or computations, and (2) remove redundant indexes) should be applied across the whole database.
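Concretely, in MySQL this amounts to something like the following for the speaker table (assuming, as implied above, that LastName_2 leads with the LastName column):

SHOW INDEX FROM speaker;

-- LastName_2 already covers LastName as its leading column, so the single-column index is redundant
DROP INDEX LastName ON speaker;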
Recommendations
The recommendations above to delete unnecessary indexes would actually increase database performance for updates as well as decrease database size, both of which are part of optimization.
Tuesday, April 10, 2007
Google is at it again!
First, we'll start with Google Website Optimizer. This program lets you create two or more similar landing pages and then tell Google what percentage of your traffic you want to send to each page. All of this traffic would be coming from your pay-per-click ads. Google then runs these tests until the results become statistically significant and displays everything in simple reports that tell you the effectiveness of each landing page. All you need to do is create the similar pages and embed a little bit of JavaScript code on each one. I haven't tried it yet, but knowing Google, I'm sure it is very easy. This would be very easy to implement on Choiceskills.com, even faster than Site Catalyst.
Second, I recently received an email from Google giving me $400 toward my Google radio ads account. How cool would it be to do a radio spot for portableheavydutyaircleaners.com? Something that would never have been within reach for small, small businesses is now available to all.
Go Google!
Friday, April 6, 2007
TGIF
Thursday, April 5, 2007
Character Education Landing Page
Character Education Landing Page
As you can see, it's still a work in progress, but most of the information is there. I put the key phrase in all the meta tags, and all the links in the main body are H1 tags, styled with CSS to look like normal links. The links at the bottom are all good keywords that can be linked to various pages on the website.
Tuesday, April 3, 2007
Log4J vs. JMS
Log4J - A Java utility that allows a developer to turn logging on and off at runtime. Logging statements consume resources and can add up to thousands of lines of code, and all that extra code can slow performance; Log4J seeks to minimize that resource usage. Logging can be a very important tool, not only for error messaging but also for debugging. Log4J uses an inheritance hierarchy of loggers to help control which message levels are output.
JMS - This topic is a little more complicated than Log4J. As far as I can tell, JMS is a way for a Java program to pass certain types of messages from one server to another:
- Publish-Subscribe - When a single message needs to be received by multiple applications. (Seems like this could also be accomplished through web services)
- Point-to-Point - When one application is sending a direct message to another application
- Request-Reply - When an application sends a message and is waiting for a message in return
JMS seems like more of a way to pass information from one application to another, much like web services, while Log4J is used within a single application for logging and debugging purposes. There are versions of Log4J for all of the major programming languages; JMS, on the other hand, is only for use with Java. I'm not exactly sure how Log4J and JMS fit together. I'll have to wait for class tomorrow.
Choice Skills Banner

We've recently been working on improving a website in my web analytics class. We each made a mock banner that could go up on www.choiceskills.com. Here is my entry, though it wasn't as good as some of the others. This site sells education materials for teaching people through character-based learning. If you know any teachers, pass this website along to them.