Analytic Techniques
	Raster GIS Tutorial (Earth Shelter) 
This tutorial serves several purposes. First, it demonstrates a methodology for turning
   questions and intentions regarding the real world into conceptual models; for representing the concepts in a model with data and functions in a data model; and for using a data model as a laboratory for experimentation.  Most importantly, this process
   will end with a mentality and a method for understanding and discussing the degree of confidence
   we have in a model, and for evaluating its utility, either as a means of generating
   useful information about alternative futures for a place, or at least as a means of
   understanding something about the challenge and potential of making useful models of the data-world that help us to know useful things about the real world.
On a secondary level, this tutorial takes us through several layers of technology:
   an understanding of raster data structures, and of the
   functional components and grammar of map algebra, which transform and derive associations from raster
   layers, cells, zones and neighborhoods -- to represent spatial relationships among
   concepts.  Finally, we will look at some ways that map algebra functions can be linked
   together to make elaborate chains of logical operations that may be tweaked and re-run to
   perform experiments for investigating alternative strategies for understanding or changing the
   landscape.
Tutorial Dataset

Purpose and Question
In beginning any exploration of models, it is important to have a clear statement of
   purpose.  Without a clearly stated purpose it is pointless to try to evaluate anything.
Power in the Landscape: The Pilgrim Pueblos Project
The rising cost of fuel and the problems caused by burning fossil fuels and by
   nuclear power have led us to an increased appreciation of renewable energy, and
   in particular of passive solar design.  After some research, we have determined that
   some sites are better for passive solar homes than others.  We want to buy up
   select sites around the country to build housing developments that will have the
   greatest passive solar potential.  We plan to create a system that is national
   in scope.  We will employ a mashup
   similar to HousingMaps.com, which will
   automatically process real-estate parcels that come on the market.  We will take the
   location of each parcel and use data of national extent to filter and flag parcels
   that have high earth-shelter potential.  Flagged parcels will be examined more closely by
   a trained map interpreter before being recommended for a site visit and a potential offer.
 We begin with a few simple criteria for establishing
   the potential of a site:
  Conceptual Model
  We seek sites having the following properties:
- Building sites must Have Sufficient Slopes to take advantage of
   earth-sheltered design and passive cooling in the summer.
 - Slopes should Be More or Less South-Facing for best solar heat gain in the winter.
 - Sites should Have Forested Areas Upwind so that trees can provide shade and slow down the winter winds.
 - Building sites that Are Directly Up-Slope from Drainage Features will not be rated so highly.
 - But building sites Having Potential Water Views will receive higher marks.
 - Building sites that have High Accessibility to Commercial Areas will be flagged for special
   scrutiny (for our more urbane customers).
 
 
Note that each of the terms highlighted in bold in the conceptual model involves a term
   of fact, e.g. Commercial Areas, and a term of relationship, e.g. High Accessibility.
   Part of our task will be to find data to represent the static facts, and procedures to represent
   the relationships in this model.  This arrangement of data and procedures will be our
   Data Model.  Our data model becomes a laboratory for experimentation when we alter aspects
   of the facts and the relationships and think logically about the impacts of various decisions.
Understanding Models, and the Choice of Errors
Part of our goal is to understand how well we can build a model that will work on a
   national scale, using the concept of a mashup (see HousingMaps.com).  This mashup would automatically find listings of real estate for sale, evaluate each of them for potential for one of our developments, and flag promising properties for further on-site evaluation.  This will require reasonably consistent data that is national in scope.
   But before implementing this national program, we need to develop a pilot test case using the best quality local data that we can find.  This calibration dataset will
   allow us to evaluate the sorts of error we may face when we try to build a model with
   coarser-grained national data.
Closely examining both models in the same pilot area will
   help us develop a level of confidence in the national model.  It is important to think about
   two important types of error.  First are Errors of Omission, in which we fail
   to identify a site in our data model as having potential when, on the ground, it actually
   does.  We can also expect Errors of Commission, in which we identify sites in our
   data model as having potential, but investigation on the ground reveals that they don't,
   in reality, meet our criteria.  As it happens, the decisions we make in implementing the
   model will lead to more or less of each of these types of error.  Choosing which type
   of error you would rather have is a key element in the design of data models!
Because our model is the first, automated stage in a site evaluation process, we would
  rather have a model that is biased toward errors of commission.  The second stage of
  evaluation is to use a trained human being to examine the sites that are recommended
  by our model.  If our model fails
  to flag sites that possibly have potential, then our analyst will have less work, but
  we may lose the opportunity to examine a site which may, in reality, have potential.  The feedback
   from the analyst about which sites are flagged, either rightly or wrongly, will help us to
   fine-tune the model over time.
An Overview of Common Geoprocessing Tools
One of the goals of this tutorial is to provide a setting for demonstrating several of the most common cartographic modeling tools.  This is by no means a complete list, but it should cover most of what a student might need to make some very interesting models of their own.  The list below introduces the tools and provides a reference to where each tool can be found in the geoprocessing toolbox.
	It is a good idea to browse around in the toolbox to learn about the hundreds of other tools that you may find useful.  It can also be useful to search for tools using the Find Tools dialog under the Geoprocessing menu in ArcMap.
	
	
Outline
- An introduction to Cells, Attributes, Layers and Zones in discrete and continuous rasters.
 - A simple extraction of information from a dataset using a boolean operation.
 - Creating a re-usable model for evaluating earth-sheltering capability.
 - Distance Tools: Spatial Analyst / Distance
 - Surface Analysis Functions: Spatial Analyst Tools / Surface
   - Slope: How steep is the terrain at each location?
   - Aspect: Which direction are slopes facing?
 - Reclassifying Rasters: Spatial Analyst / Reclassify
   - Reclassify: This tool is essential for assigning normalized values to rasters for use in overlay models.
 - Focal Functions: Spatial Analyst / Neighborhood
   - Focal Statistics: How much forest is nearby or upwind of each cell on the map?
   - Point Statistics: How many grocery stores are within a given distance of each cell on the map?
 - Zonal Functions: Spatial Analyst / Zonal
   - RegionGroup: Creates zones from clumps of contiguous cells.
   - Evaluating a property parcel with a zonal function.
   
 
 
A Careful Approach to Building Models
This tutorial covers a process of chaining procedures together to transform the information we have into the information that we need to solve a specified problem.  As we proceed, some of our criteria will emerge and be adjusted after we have had a look at associations among our transformed layers.  At each step it is
   crucial to examine the outputs that are generated, to make sure that they make logical sense
   based on an examination of the topographic map and aerial photograph.  It is very easy to make
   a mistake that will generate a ridiculous result.  The absurdity of the result may be obvious
   when we look at it directly; however, it will be very difficult to figure out that we have made a
   mistake once we have combined the erroneous layer with other steps of our analysis.  This
   step-by-step evaluation of our work will also be a good time to think about which aspects of
   data quality, or of the decisions we make, will be critical factors leading to errors of
   omission and commission.
Examine a Discrete Value Raster
Cartographic modeling procedures make heavy use of Raster Layers, which represent
  evenly-spaced, congruently-defined locations on the ground as Cells.  In a single
  layer, each cell is tagged with a Value, which may be used to discriminate various
  discrete types of locations known as Zones, or to represent surfaces that vary from cell to
  cell in Continuous fashion.  The regular relationships among cells allow for many powerful
  ways to create and use logical relationships among locations and their properties, as we
  shall see.
The New England Gap Vegetation (Gap_Veg) layer is a discrete value raster.
  There are several ways to
  evaluate the raster dataset.  For one thing, it has metadata that can be found
  in its folder.  But even without any metadata at all, we can learn
  a lot about this layer by examining its properties and its logical consistency with other
  layers.
References
Consider the MassGIS land cover layer, the
  USGS National Land Cover Dataset, the
  USGS hydrography layers, and the Massachusetts Department
  of Environmental Protection wetlands layers, in terms of their fitness for identifying
  potentially scenic water features.
Examine a Continuous Surface Raster and some Surface Functions
The Digital Elevation Model (DEM) is an example of a Continuous Value raster.
Each cell value is an observation of the height of that cell.  Obviously, no matter how
spatially precise you want to get, you could measure a different height, so the precision
of this raster is a critical question to investigate.  Without knowing anything about the
DEM, we can guess that fluctuations of elevation smaller than the cell size are not
well represented.  The DEM provides, at best, a general idea of the elevation surface
in our study area.  Because of the regular arrangement
of cells, quantities like Slope and Aspect can be calculated by looking at the
other cells in the 3x3 cell Neighborhood of the cell in question.  Naturally,
the precision of the new slope and aspect layers is a function of the precision of
the DEM.  We may be able to get a sense of the utility of the slope and aspect layers
by looking at the results on top of our familiar USGS topographic map.
This demonstration will also introduce ad-hoc use of tools in the ArcGIS toolbox.
Demonstration
- Take a look at the USGS Digital Elevation Model.
 - Use the information tool to measure some heights.  Note that the cell values have a lot of
   decimal places.  Looks precise, but...
 - Determine the cell size.
 - Look at the DEM on top of the USGS quad map.  Zoom into a landform on the topo map and
  flick the DEM on and off.
 - It appears as if the DEM is as good a representation of terrain as the topo map, maybe better.
   This layer is definitely not adequate for planning the precise location of a building foundation,
  but it may be useful for quickly
   locating areas of generally steep slope and generally south-facing aspect.
 - Find the Aspect tool using the Search For Tools option in the geoprocessing menu.
 - Open the Aspect dialog and fill in the blanks to create a map of slope direction.
 - Adjust the display properties to make the layer 40% transparent.
 - Look at this layer over the USGS map and shaded relief and judge whether it makes sense.
 - Do the same with a slope calculation.
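The same ad-hoc derivation can also be scripted.  Here is a minimal arcpy sketch of the Slope and Aspect operations, assuming a Spatial Analyst license and a DEM raster named dem; the workspace path is hypothetical:

    import arcpy
    from arcpy.sa import Slope, Aspect

    arcpy.CheckOutExtension("Spatial")                # Spatial Analyst license
    arcpy.env.workspace = r"C:\earthshelter\scratch"  # hypothetical path
    arcpy.env.overwriteOutput = True

    # Slope (in degrees) and aspect (compass direction of steepest descent)
    # are each computed from the 3x3 neighborhood around every cell, so
    # their precision is limited by the precision of the DEM itself.
    slope = Slope("dem", "DEGREE")
    aspect = Aspect("dem")
    slope.save("slope")
    aspect.save("aspect")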
 
 
Making a Model to Combine and Save Strings of Data and Operations
With the Aspect operation we carried out in the previous demonstration,
  we have seen how a geoprocessing tool can be used in an ad-hoc fashion.  Tools take
  input data, have settings that control how things
  are processed, and produce output datasets.  You can imagine how the output of
  one tool may be the input of another, and how the settings of individual tools
  might be saved so that we can re-run a chain of procedures, altering some settings, without
   having to set up several tools
  again and again.  To do this we will learn how to save our workflows as Geoprocessing
  Models.  Like map documents, models can be set up with
  relative pathnames so that they can travel around with our data and be explored, re-run
  and adjusted by our collaborators.
Demonstration
- Create a new folder within the sample dataset folder to hold your tools, models and maps.  Name
  it with your unique user name.
 - Within your folder, make a scratch folder to hold the intermediate results of your models,
   and create a data folder to hold new data created by your models (data that you will
   want to store persistently after your model has finished).  Create a Tools folder as well.
 - Click Geoprocessing->Options to allow models to overwrite the results of geoprocessing operations.
 - Use the Environments entry on the Geoprocessing menu to set the following General Settings
   (the same settings are scripted in the sketch after this list):
   - Workspace and scratch workspace: your own scratch folder
   - Analysis extent: the extent of the layer T_m_Clip
   - Raster Analysis cell size: 10 meters
   - Output Coordinate System: Massachusetts State Plane, NAD 83, Meters
 
- Right-click in the toolbox window to create a new toolbox for your models.
   Right-click on this toolbox and check the properties to verify that the new toolbox was created in
   your user folder.
 - Create a new model in your toolbox, and open it for editing.
 - Drag the digital elevation model raster into your new geoprocessing model.
 - Find the Slope tool and drag it into the model.
 - Double-click the box for the Slope tool and fill in the blanks.  Note that the icon for the DEM
   layer in the model is now blue.  Note that the output dataset is being placed
  into the scratch workspace folder.  We can give it a reasonable name.
 - Use File->Model Properties to set the model to use relative pathnames.
 - Run the model.
 - Right-click on the slope output layer and choose Add to Display to see it on the map.
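For reference, here is how the same environment settings look when scripted with arcpy.  The folder names are hypothetical, and EPSG code 26986 (NAD 1983 Massachusetts State Plane Mainland, meters) is an assumption you should verify against your data:

    import arcpy

    arcpy.env.overwriteOutput = True               # let re-runs replace outputs
    arcpy.env.workspace = r"C:\earthshelter\yourname\scratch"        # hypothetical
    arcpy.env.scratchWorkspace = r"C:\earthshelter\yourname\scratch"
    arcpy.env.extent = "T_m_Clip"                  # analysis extent from this layer
    arcpy.env.cellSize = 10                        # 10-meter analysis cells
    arcpy.env.outputCoordinateSystem = arcpy.SpatialReference(26986)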
 
 
Exploring a Ready-Made Model: Reclass and Map Algebra Functions
The fact that models can be made in advance and re-used is a great advantage in teaching,
since I can prepare a model and demonstrate it in class without having to spend a lot of
class time fiddling with settings.  For the next demonstration, we will examine a model that has already
been prepared and packaged with the sample dataset.  The next few demonstrations will discuss
parts of the Earth_Shelter model in the pbc_tool toolbox.  The first phase of this model
adds Reclass functions to convert the different scales of the slope and aspect
maps into a common scale.  Then we will use a Map Algebra statement to calculate a new
layer whose cell values are a simple function of the cell values of the two input maps.
This combination of operations will produce a new raster that ranks each cell in the study area
on a scale of -2 through +2 according to the compatibility of the terrain with our values for
steep, south-facing slopes.
This demonstration also reveals a special cell value named NoData, which is
  used as blank space.  In some functions, such as Map Algebra,
  the blank areas of NoData prevent analysis from happening.
Demonstration
- Find the pbc_Earthshelter toolbox in your toolbox panel, or add it.
 - Right-click on the model named 0. Sample Model and choose Edit to see and alter the contents of this model.  Note that simply double-clicking on it, or choosing
   Open, only shows you the parameters of the model.
 - Take a look at the settings of the Reclass function for the slope map.  Note how all
   possible values for slope are assigned the values 2: Preferred, 1: Good, 0: Possibly OK, and NoData: Absolutely not, representing our values with respect to choosing a site for
   an earth-sheltered house.  Values outside our range of desirable slope and aspect are set to
   NoData, which means that no matter what the values may be in our other layers, cell locations
   that have a value of NoData in any layer will not be considered.  In this model, you can think of NoData as representing No-Build.
 - Take a look at the Reclass function for the aspect map.  Note that areas that have a north-facing slope are reclassed to NoData, and other values for aspect are assigned values on the same scale.
 - Now let's look at the Raster Calculator function that brings the slope values
   and the aspect values together.  Think about what this expression is doing.  It creates a
  new raster layer and calculates the value of each cell as the sum of the cell values
  of the reclassed slope and aspect rasters, divided by 2.
 - Wherever a cell in either layer is NoData, the cell in the output raster is NoData.
 - We want to let aspect carry double weight relative to slope, so we will change this
   expression to ((2 * aspect_val) + slopeval) / 3.
 - Right-click the Single Output Map Algebra function and choose Run.
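Scripted, this reclass-and-combine phase looks something like the sketch below.  The slope and aspect break values are illustrative placeholders, not the exact remaps in the Sample Model (inspect the Reclassify dialogs for those):

    import arcpy
    from arcpy.sa import Reclassify, RemapRange

    arcpy.CheckOutExtension("Spatial")

    # Slope and aspect reclassed onto a common value scale; values that
    # fall outside the remap ranges become NoData ("no-build").
    slope_val = Reclassify("slope", "VALUE",
                           RemapRange([[8, 15, 1],        # good
                                       [15, 30, 2]]),     # preferred
                           "NODATA")
    aspect_val = Reclassify("aspect", "VALUE",
                            RemapRange([[90, 135, 1],
                                        [135, 225, 2],    # roughly south-facing
                                        [225, 270, 1]]),
                            "NODATA")

    # Map algebra: aspect carries double weight relative to slope.  A cell
    # that is NoData in either input is NoData in the result.
    earth_val = (2 * aspect_val + slope_val) / 3
    earth_val.save("earth_val")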
 
 
Evaluation
It is important to take a break here and understand the new information we have created.
  It is very easy to make a little mistake in the settings of your functions that can have
  very important consequences in the model.  If you don't check each step, you can easily
  fold a bad mistake into several other steps, whereupon it will be very difficult to figure
  out where the problem is (if you ever discover it at all).  Aside from this, even if the
  functions operated as expected, it is important to see whether the result makes logical sense
  vis-a-vis other data.
- Open the new earth_val layer and give it a shadeset that includes a ramp of color from red to green.
   Make the layer 40% transparent.
 - Overlay this layer on top of the USGS Quad Maps.  Does it appear to make sense?
 - Use the Information tool to understand the values for each cell.  Do they jibe with 
   your understanding of what should have happened in the Map Algebra expression?
 - Examine the output of this map for logical consistency with the USGS Quad Maps.
 
 
Converting Vector Features to Raster
  Now that we have found places whose landform has the capability to support an earth-sheltered
  house, we need to narrow our focus.  We are especially interested in land that is forested
   or that is already residential.  Later on, we may add more criteria, but this is a start.
  This leads us to explore two more important
  activities of cartographic modeling: conversion of vector features to rasters, and a new way
  of using the map algebra function.  As with many of the datasets we have, the
  MassGIS Land Use dataset is a vector feature class, not a raster.  Since many cartographic
  modeling functions require their inputs to be
  rasters, we must convert it through a process called Sampling.  When doing this,
  we have to keep an eye on the cell size and on what we lose when we trade polygons with
  precise vertices for cells.
Demonstration
- Turn on the Land Use layer from MassGIS.
 - Which of the polygons on this map meet our criteria for land use?
 - It is very important
    that you understand the data you are using and not make assumptions.
 - Examine the attribute table and read the metadata.
 - In the Earthshelter model, open the Polygon to Raster function and examine
   its settings, especially the Cell Size and Field.
 - Run the Polygon to Raster function.
 - Examine the raster result with regard to its original vector polygons.  Have we lost
   anything important?
 - Use the information tool to click around the new land use raster and observe the
    values of various cells.  It is useful here to consider how the choice of cell size
    has resulted in the loss of information about the precise edges of the polygons.  Do you
   think that the resulting error is going to be critical in terms of the overall utility
   of the model?
 - Now run the Reclass command to convert land use to land use value.
 - Note that land use zones that are not assigned a specific value are converted to 0, since the existing land use of a site may enhance its desirability but does not constitute a
  deal-breaker in the site selection process.
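A scripted version of this conversion step might look like the following.  The layer name, the land-use code field, and the specific codes in the remap are assumptions to check against the MassGIS attribute table and metadata:

    import arcpy
    from arcpy.sa import Reclassify, RemapValue

    arcpy.CheckOutExtension("Spatial")

    # Sample the land-use polygons into 10-meter cells.  Each cell takes
    # the code of the polygon covering most of its area.
    arcpy.conversion.PolygonToRaster("landuse", "LU_CODE", "lu_ras",
                                     "MAXIMUM_AREA", "", 10)

    # Convert codes to our values.  In the real model every remaining
    # land-use code is mapped to 0 (neutral); only a few rows are shown.
    lu_val = Reclassify("lu_ras", "VALUE",
                        RemapValue([[3, 2],      # e.g. forest
                                    [10, 1],     # e.g. residential
                                    [1, 0],
                                    [2, 0]]),
                        "NODATA")
    lu_val.save("mgis_lu_val")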
 
 
Evaluation
Once the Sample Model has been run all the way through, we should take a look at the result to see
   if it makes sense and reflects our values.  I will call your attention to an interesting
   thing that happens with raster elevation models.  Use View->Bookmarks to go to
   Great Hill, in Acton.  Notice first that the model seems to make sense with regard to the aspect
   and the distance from the stream.  But notice the horizontal striations that are blanking out the
   slopes along the hill.  If we examine the component layers, we can see that this apparent error
  of omission is a result of the slope function.  Apparently the elevation model
   shows a horizontal step on the hillside that is not reflected in the USGS contour map.  The fact that these striations are mostly east-west on our map arouses some suspicion that they
   may be artifacts of the process of creating elevation models.  This sort of error isn't
   something that should cause us to doubt the entire model, but it is something we
  should keep in mind when thinking about whether our data are as good as they ought to be, or
   whether we ought to look for a better elevation model.
Thinking about Relationships
Let's back up now and think formally about Modeling.
   We have developed a concept of sites that are propitious for earth-sheltered housing.
   Our conceptual model so far includes three concepts of fact:
      - Slopes steep enough for earth-shelter
      - Aspect for southern exposure
      - Land use: forested or existing residential
Our model also includes a concept of Relationship:
- Juxtaposition of propitious factors occurring in the Same Location
 
You can see that this association among facts is simulated in the Map Algebra procedure
   in the model.  This procedure examines each location in a list of layers and produces a new layer
   whose cell values are each a function of the juxtaposed cells in the input layers.  In
   Tomlin's Map Algebra notation, this sort of association of cells is a Local Function.
   Tomlin has several other classes of associative procedures in his Cartographic Modeling language:
   Incremental Functions, Focal (Neighborhood) Functions and Zonal Functions.  You can check out these illustrations to see how they work.
Incremental Functions: Distance, Movement Across the Landscape
A Local relationship describes two facts corresponding to the same location.  Let's think
   about another relationship we may want to model.  For example, in our investigation of
   earth-shelter sites, we can see that there are many areas that our model designates as propitious
   for earth-sheltered housing which are actually on bluffs up-slope from drainage features
   such as streams or open water.  This is a problem not only of erosion but also of the
   long-term stability of the site, which may be involved in a landslide.  We would like our model to eliminate the worst of these and to rank other sites lower in value if they are too near
   water bodies,
   so that we don't waste our time and money examining sites that are more than likely unbuildable.  So we have a new
   concept of fact, Drainage Feature, and a new concept of relationship, Proximal To.  There are
   several ways we could model this relationship.  The simplest is the concept of Euclidean Distance.
   This function takes a source layer, which defines an area to be measured
   from, and an area to be measured into.  The source layer may be a raster or it may be vector
  features.  In the case of a raster source layer, the cells we intend to measure From can have
   any value, but the cells in the area we intend to measure Into need to
  have a value of NoData.  The incremental function then produces an output that measures,
  incrementally, some function of distance, or the accumulated cost of traversing the landscape to get
   to the nearest source cell.
Demonstration
In class we will demonstrate the Euclidean Distance function in the Sample Model.  This
  application uses Euclidean Distance with the linear hydrologic features as the source layer.
  The distance function produces an output layer whose cell values represent the distance
  to the nearest source cell on the source layer.  This new layer then represents the relationship
   Near Hydrographic Features.  In this sense it is like the buffer functions that are commonly
   used in water protection ordinances in many cities.  It is easy to measure on a map
   (even without GIS), but it does not take slope or ground cover into account.  We will look at a
   better way of doing this later in this tutorial.  In our model, we translate this distance
   according to our purpose of evaluating sites for earth-sheltered housing.  Sites within 50 meters
   of a stream will be considered no-build; sites between 50 and 100 meters will be considered, all
   other factors being equal; and sites farther than 100 meters away from hydrographic features are
   preferred.
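The distance-and-reclass pair can be sketched in arcpy as follows.  The hydro layer name is an assumption; the 50- and 100-meter breaks come from the text above, while the output values mirror the no-build / considered / preferred ranking:

    import arcpy
    from arcpy.sa import EucDistance, Reclassify, RemapRange

    arcpy.CheckOutExtension("Spatial")

    # Straight-line distance (in map units, here meters) from every cell
    # to the nearest hydrographic feature.
    hy_dist = EucDistance("hydro_lines")        # layer name is an assumption

    # Translate raw distance into site values: <50 m no-build (NoData),
    # 50-100 m acceptable, >100 m preferred.
    hy_dist_val = Reclassify(hy_dist, "VALUE",
                             RemapRange([[0, 50, "NODATA"],
                                         [50, 100, 1],
                                         [100, 1000000, 2]]),
                             "DATA")
    hy_dist_val.save("hy_dist_val")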
 
Weighted Overlay: Putting it all Together
Note how the last step in our simple sample model uses the Single Output Map Algebra
   function to compute the weighted average of all four factors:

   ( slopeval + ( 2 * aspect_val)  + (0.5 * mgis_lu_val) + hy_dist_val) / 4.5

   We note here that this map algebra
   function is a local function.  We can also see that the terms in the map algebra expression
   are weighted, so that aspect is weighted double with regard to slope.  The land use is given half
   a weight.  Distance from water and slope are not weighted, which is the same as giving them a
   weight of one.  The weighted average is computed by summing the weighted terms and dividing by the sum
   of the weights.
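The same expression can be written with arcpy Raster objects.  Dividing by 4.5, the sum of the weights (1 + 2 + 0.5 + 1), keeps the result on the original value scale:

    import arcpy
    from arcpy.sa import Raster

    arcpy.CheckOutExtension("Spatial")

    # Local function: a weighted average computed cell by cell.  Any cell
    # that is NoData in any input is NoData in the output.
    score = (Raster("slopeval")
             + 2 * Raster("aspect_val")
             + 0.5 * Raster("mgis_lu_val")
             + Raster("hy_dist_val")) / 4.5
    score.save("site_score")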
Discussion
At this point, we should take a look at the results of the model to judge whether it
   makes sense, based on our visual inspection of the results on top of the topographic map
   and aerial photograph.  It is useful to keep in mind that we don't expect this model to
   be perfect, but it would be nice if it did as good a job as we feel is possible given the data
   and the procedural tools that we have.  In the next series of sections, we will explore a
   few more sophisticated ways that we might approach this problem.
Modeling Complex Incremental, Focal and Zonal Relationships
The approach we have taken so far in this tutorial has shown us the major workflow patterns
   involved with cartographic modeling: transforming data into representations of our own
   values, modeling local and incremental relationships, and some simple terrain analysis.
   The next section elaborates on our model a bit more, to showcase some fancier incremental
   functions that take into account areas having variable resistance.  We will also look at
   visibility functions, which represent another sort of incremental relationship.  There are two more
   major types of relationships that are part of the cartographic modeling toolkit: Focal
  Functions, which consider the relationship of a cell with the facts represented in a data layer
  within a specified neighborhood, and Zonal Functions, which take measurements of phenomena within
  areas defined by the zones in a raster or the polygons in a feature class.
  Since our model is going to get a bit more complicated, we will break it down into
   separate sub-models, each of which will yield a value raster that summarizes the local
   pluses and minuses of each cell within our study area with special regard to the
   development of earth-sheltered housing.  These will then be given the weighted-average treatment
   in the very last phase.
 
  
Cost-Weighted Distance
Model Number 2 uses a distance function to look at the proximity of commercial centers.
   We value good access to commercial centers.  To demonstrate this sort of distance function,
   we will consider accessibility by car, since this is a common thing to do.  We leave it
   as an exercise for you to figure out how to convert this to a model of pedestrian and cycling
   accessibility.  Our model converts the road class attribute to a value of cost for crossing
   a cell.  Because the reclass function can only yield an integer raster, our first transformation
   yields values of minutes per kilometer.  Because people do not move only on roads, we reclass the value of the land
   between the roads from NoData to 10.  This simulates the cost of traveling overland to get from
   the road to your destination.  The cost units required by the cost distance function
   are expected to be expressed in terms of the distance units of the
   input layers or the geoprocessing environment.  This is why we use the map algebra function to
   divide our cost units by 1000, converting them to minutes per meter.  Now when we use this cost
   layer in the CostDistance function, the relationship represented in our output for each cell will be the cost of getting to the nearest commercial area (in minutes).
   Model 2a
   makes this accessibility model more realistic by stipulating that water cells are not
   so easily crossed as land cells.  This model introduces a couple of important functions
   that are often necessary when transforming raster datasets:
- The Merge function effectively overlays a list of rasters, allowing rasters
   listed first to supersede rasters listed subsequently.  In the merge function, cells having
   a value of NoData are treated as transparent, according to this overlay analogy.
 - Conditional Functions: these allow you to create a new raster whose cell values
    are defined conditionally, based on the evaluation of an expression.
 
The merge function lets us overlay the road values over the stream values, and the
   conditional function SetNull allows us to set the wet cells to a value
   of NoData, which effectively dictates that the only way to cross
   a hydrographic feature is where a bridge already exists.
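Here is one way the cost surface and travel-time steps might be scripted.  The Con/IsNull pair stands in for the Merge step (NoData treated as transparent), and the layer names are assumptions:

    import arcpy
    from arcpy.sa import Raster, IsNull, Con, SetNull, CostDistance

    arcpy.CheckOutExtension("Spatial")

    # Fill the off-road NoData cells with an overland cost of 10
    # (minutes per kilometer), leaving road cells at their road cost.
    cost_km = Con(IsNull("road_min_km"), 10, Raster("road_min_km"))

    # Water is impassable except where a road (bridge) crosses it:
    # set NoData wherever there is water but no road.
    wet_no_bridge = (IsNull("water") == 0) & IsNull("road_min_km")
    cost_km = SetNull(wet_no_bridge, cost_km)

    # CostDistance wants cost per map unit (meters), so divide by 1000;
    # the output cell values are minutes to the nearest commercial area.
    minutes = CostDistance("commercial_areas", cost_km / 1000.0)
    minutes.save("access_min")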
Incremental Relationships where Slope is a Factor
Model 3 provides a better way of modeling the relationship of water features to
   potential building sites.  This model uses
   the PathDistance tool, which yields the distance across the landscape (considering that travel
   on a slope accumulates more distance than travel on a level surface).  Path Distance will also
   consider that some types of land cost more to cross than others, due to vegetation or roads, for example.
   Finally, we can adjust the Vertical Factor properties to restrict the measurement so that it will only
  travel up-slope -- which, where runoff is concerned, is the sort of relationship we want to measure.
 These distance-measuring tools provide a means of representing a relationship of proximity or accessibility.
   For example, now that we have a means of measuring the relationship between water bodies and potential sites
   upslope, we can use this distance layer in our evaluation of sites.
Note that this model does a couple of new things.  First, we transform land use to runoff resistance using a relational join to a lookup table.  Before we do this, we make a new copy of the land use
layer.  Adding the join to a copy of the layer is a good idea, since we may want to run this model more than once, and joining to a new layer guarantees that the model won't encounter a situation where the join already exists -- which would cause the model to fail with an error.
Another feature of this model is that we have pulled the cell-size environment variable out of the path distance procedure.  Because this process is fairly time-consuming, enlarging the cell size
  from 10 to 30 reduces the number of cells that need to be processed by a factor of 9.  Once we have this model working the way we want it, we can poke this cell size back down to 10 and run the model while we are at home in bed.
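A sketch of the up-slope path-distance step, with assumed layer names.  VfBinary with a zero-cost angle range of 0 to 90 degrees is one way to permit only uphill movement; verify the vertical-factor settings against the tool's documentation:

    import arcpy
    from arcpy.sa import PathDistance, VfBinary

    arcpy.CheckOutExtension("Spatial")
    arcpy.env.cellSize = 30     # coarser cells while tuning; 10 for the final run

    # Distance from hydro features, accumulating extra length on slopes
    # (surface raster) and extra cost on resistant land cover, moving
    # up-slope only (vertical factor).
    updist = PathDistance("hydro_features",       # sources
                          "runoff_resistance",    # cost from the lookup join
                          "dem",                  # true surface length
                          None, None,             # no horizontal factor
                          "dem",                  # vertical-factor surface
                          VfBinary(1.0, 0, 90))   # uphill moves only
    updist.save("upslope_dist")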
Focal Functions: What is in My Neighborhood?
Sometimes a quality of a place is a function of qualities of the areas surrounding it.
   For example, a good location for a farmhouse may be related to the amount of productive
   land within a circular area of 500 meters in radius around the location in question.
   Focal functions (also known as neighborhood analysis) let us define the geometry of a
   neighborhood, and then summarize the values of a data layer (e.g. productive land)
   within that neighborhood as it is centered on, and evaluated for, every cell on the map.  The value
   of each cell in the output grid is the statistical summary (sum, average, variety...)
   of the values within that cell's neighborhood.
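The farmhouse example maps directly onto FocalStatistics with a circular neighborhood.  In this sketch, productive_land is an assumed 0/1 raster, and a 50-cell radius corresponds to 500 meters at our 10-meter cell size:

    import arcpy
    from arcpy.sa import FocalStatistics, NbrCircle

    arcpy.CheckOutExtension("Spatial")

    # For every cell, sum the 0/1 values within a 500 m circle centered
    # on that cell: the output is the amount of productive land nearby.
    nearby = FocalStatistics("productive_land", NbrCircle(50, "CELL"), "SUM")
    nearby.save("prod_500m")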
Measuring the amount of trees upwind
Our simple model favors sites that are located in the forest.  Part of this idea is that the
   trees will slow down the winter wind, which comes predominantly from the northwest.  In our more
   complex model we want to be more precise about the relationship between our site and trees.
   It is not necessary to be IN the forest to get the value of trees as a windbreak.  The critical
   relationship is that a site should have Trees Upwind in the wintertime, when the winds are
   predominantly from the northwest.  Our Trees Upwind model creates a wedge-shaped analysis neighborhood, which can be used to represent the Upwind relationship for each cell.  This neighborhood is moved over a forest raster to count the number of forested cells upwind.
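In arcpy terms this is FocalStatistics with a wedge neighborhood.  The sketch below assumes a 0/1 forest raster; NbrWedge angles are measured in arithmetic degrees (0 = east, counterclockwise), so the 100-170 degree span aimed at the northwest is an assumption to tune:

    import arcpy
    from arcpy.sa import FocalStatistics, NbrWedge

    arcpy.CheckOutExtension("Spatial")

    # Count forested cells in a wedge opening toward the northwest of
    # every cell -- the "Trees Upwind" relationship.
    upwind = FocalStatistics("forest01",
                             NbrWedge(30, 100, 170, "CELL"),  # radius 30 cells
                             "SUM")
    upwind.save("trees_upwind")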
Potential Views of Water
One of the more interesting incremental functions is Viewshed, which uses
   a layer defining the source (known as the observer locations, though these may just as
   easily be thought of as the observed!).  An elevation raster establishes the barriers
   to visibility.  Like the distance functions, the Viewshed function attempts to spread from the
   observer points, across the elevation model, until it encounters a barrier.  In our
   model of Water Views, we begin by establishing observer points in the water.  For this,
   the Random Points function is very handy.  Then we simply calculate the viewshed
   from these points.  Note that although the output of the viewshed tool is portrayed with only two
   values, Visible and Invisible, if you change the symbology of this raster and use the get-info tool, you can see that the value of each cell varies: it is a measure of the number of observers that can see a given cell.  Note that in this model, we set the raster analysis environment of the viewshed
   tool so that it uses a cell size of 50 meters.  Using the default cell size for this project (10
   meters) makes this procedure take 25 times longer.  After we get comfortable with the way this is working, we may adjust this cell size down.  This will be particularly important when we add buildings to the model.
Our second version of this model adds buildings to the terrain model and adds offset values
   to the observer points to stipulate that the cells on land (where we expect our earth-shelter
   residents to be looking from) should be considered to be offset 2 meters, to simulate the height
   of a person.
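A minimal sketch of the water-views workflow, with assumed names and point counts; observer height offsets are supplied through fields on the observer points (the Viewshed tool reads a field named OFFSETA, if present):

    import arcpy
    from arcpy.sa import Viewshed

    arcpy.CheckOutExtension("Spatial")
    arcpy.env.cellSize = 50      # coarse cells while experimenting

    # Scatter 100 random observer points inside the water polygons,
    # then count, for each cell, how many of them can see it.
    arcpy.management.CreateRandomPoints(arcpy.env.workspace, "water_pts",
                                        "water_polys", "", 100)
    views = Viewshed("dem", "water_pts")
    views.save("water_views")    # cell value = number of observers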
Parsing Ridges, Hilltops, Valleys, and Hillsides
Take a look at the feng shui model in my latest sample earthshelter dataset.  It uses two focal mean operations to find ridges, valleys and mid-slope areas.  You need to tune it to find the sorts of areas you are looking for.  It is tuned by playing with the radiuses of the larger and smaller focal mean operations.  (They should actually be 50 and 100 to start; there is an error in the model in the on-line zip file.)  This model basically makes two different smoothed elevation models and subtracts them to create a difference raster where the wrinkles are positive (+) and the creases are negative (-).  You also tune it in the final reclass of the difference raster.  In our case, you would want to give valley areas and hilltops a lower score and valley walls something else, etc.  Actually, I have left this factor out of our model, since I think that our slope criteria serve well enough, but I have left the sample in the toolbox because it is a useful demonstration of FocalStatistics.
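The core of the feng shui model is just two focal means and a subtraction; here is a sketch using the corrected radii mentioned above:

    import arcpy
    from arcpy.sa import FocalStatistics, NbrCircle

    arcpy.CheckOutExtension("Spatial")

    # Two smoothed terrains: their difference is positive on the
    # "wrinkles" (ridges, hilltops) and negative in the "creases"
    # (valleys).  Reclassing the difference parses the landforms.
    smooth_50 = FocalStatistics("dem", NbrCircle(50, "CELL"), "MEAN")
    smooth_100 = FocalStatistics("dem", NbrCircle(100, "CELL"), "MEAN")
    landform = smooth_50 - smooth_100
    landform.save("landform_diff")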
Summarizing Earthshelter Potential
The Site Score model simply takes a weighted average of the scores for each of
   the concepts in our model.  As you can see, there are a couple of fairly large potential
   sites.  The spread of values is not as great as it could be.  To get a broader differentiation, we could start to fine-tune our model to make sure that more cells are
  awarded 2 or -2 in the component models.  We could also try different weights on the various
   components in our weighted average function.  But for now, we will just proceed.
Zonal Functions
So far, we have looked at many different sorts of relationships -- those based on
  the cell location (local functions), others focused on the cell neighborhood (focal functions), and relationships based on proximity or accessibility (incremental functions).  The last sort of function we will look at in
  this model is based on the coverage and geometry of zones.  If you look at the attribute table for
  our score layer, you will see that all the cells are divided into just three zones.  We can look
  at the Count field to get a sense of how many cells in our study area have been assigned to
  each category.  This is interesting, and a good reminder of what zones are in the raster context.
  Now, one thing that we need to consider is that we require a fairly large site for our pilgrim
  pueblo developments.  We are looking to site at least 50 houses, so it would be helpful to be
  able to evaluate each clump of contiguous cells as a potential site.
The Large Sites model uses the RegionGroup function to create a new zone for each
  clump of contiguous cells in our site-score layer.  The model then reclasses this layer
  based on the cell count, to make a mask that will weed out potential sites that do not have
   at least 50 cells.
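A sketch of the region-group step; ExtractByAttributes is used here as a shorthand for the Count-based reclass in the model:

    import arcpy
    from arcpy.sa import RegionGroup, ExtractByAttributes

    arcpy.CheckOutExtension("Spatial")

    # Assign every clump of contiguous, same-valued cells its own zone
    # ID (8-way connectivity), then keep only clumps of 50+ cells.
    regions = RegionGroup("site_score", "EIGHT")
    big_sites = ExtractByAttributes(regions, "COUNT >= 50")
    big_sites.save("large_sites")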
Now, let's pretend that some parcels have been reported to be on the market,
   and we want to evaluate them based on how many high-quality cells of earth-sheltered
   housing potential they contain.  If you look at the Parcel-Score model, you will see that it uses
  the parcels as zones, and calculates summary statistics for each parcel based on the
   values of the cells in our large_sites raster.  The output of this function is a table, which
   can be joined back to the parcel table, giving every parcel several statistics that
  will help us evaluate whether a particular parcel merits further study.
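The zonal step might be scripted as below; the parcel layer and its ID field are assumptions, and the output statistics table is joined back to the parcels on that field:

    import arcpy
    from arcpy.sa import ZonalStatisticsAsTable

    arcpy.CheckOutExtension("Spatial")

    # One row per parcel: count, sum, mean, min, max... of the cells of
    # the large_sites raster that fall within that parcel.
    ZonalStatisticsAsTable("parcels", "PARCEL_ID", "large_sites",
                           "parcel_scores", "DATA", "ALL")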
   
Model Evaluation
Is our model a good one or a bad one?  By now, you know that this is the wrong
  question.  We should ask instead:
  - Is our model useful for our purposes?  We will get to this...
  - What are the weakest inputs and procedures in the model?
  - In what way would we expect the model to be biased: toward errors of omission or errors of commission?
  - Are there important considerations that are ignored by our model?
  - How could the model be improved?
 
This model seems to do a good job of evaluating sites for building earth-sheltered houses,
at least as well as I would be able to do by studying the USGS quad map -- and much faster --
   particularly since our plan is to scan and evaluate
  all of the real estate offerings in the United States.  In terms of our pilot study area,
  I'm sure that the method has identified many areas as potentially good which may not actually
   be good.  The model has also probably missed some spots that would be
  great for an earth-sheltered house.  The weak point here is the coarseness of the
  terrain model (10 meters), a problem that is compounded in the slope and aspect maps,
  which can only be seen as averages over a 3x3 cell neighborhood (30 meters) at best.  This
  would undoubtedly
  miss a lot of slopes that might make a fine backdrop for a house.
  Better terrain data would no doubt make this model better at finding smaller nooks
  for a house (and perhaps smaller is better?).  But no matter how fine the terrain model,
  it would be difficult to model all of the site conditions, especially the ones that can potentially
  be created.  One can always regrade a site to an extent.
Of course, I would actually spend a lot of time walking around in the study area before I
  decided on a house site.  I think that this analysis is
  a useful aid in narrowing down the areas that I would visit.  By finding many of the more obvious
  spots in a fairly easy way, it gives me more time to study the finer effects that may make a place
  more or less fit -- and this activity would probably give me more insight to fine-tune the model
  or add new data (soil type, access to roads) that would be useful.  This knowledge may also
  help us to learn a general method or rating criteria for establishing the thermal propensity
  of a site.