Migrating from Julia 0.4 to Julia 1.6

This process references release v98 (March, 2022).
This process requires that you have access to a DSWB container capable of running Julia 1.6 notebooks.

Starting notes

  1. The DSWB module pulls in and re-exports the most commonly required submodules:

    • BeaconAnalysis, DSWBSessionAnalysis, DSWBErrorAnalysis, SessionSummarizer, ResourceAnalysis
    • DBConstants, DSWBUtilities, DSWBConnecting, mPulseAPI
    • CommonSQL, SQLFilters, TableDefinitions, mPulseConstants
    • DBI, DBITypes, DataFrames
    • C3Viz, DataFrameUtilities
    • CSV, FileIO
    • DistributionStats, DataStructures (symbols must be qualified), Statistics, StatsBase, StatsFuns
    • Dates, TimeZones
    • Formatting, Logging, UUIDs
    • HTTP (symbols must be qualified), JSON, CurlHelper, LightXML
    • IJulia
  2. You can use the Pkg module to find details about available modules:

using Pkg
Pkg.status()
  3. Use the VERSION variable to make sure you're using Julia 1.6.
  4. These instructions are subject to change in future releases (see the date at the start of the doc).
  5. Julia 1.6 functions are now fully documented at the regular documentation link.
  6. We no longer have a DSWBCharting module; all charting functions are now defined alongside the matching get* function.
  7. We no longer provide the IJuliaCompat module (which gets you readprompt). Add using IJulia followed by IJulia.readprompt to get the same functionality.
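
As a minimal sketch of the VERSION check mentioned above, the following compares the running version against 1.6. The helper name meets_minimum and its required keyword are hypothetical, introduced only for illustration; they are not part of DSWB.

```julia
# Hypothetical helper (not part of DSWB): true when `v` is at least `required`.
meets_minimum(v::VersionNumber=VERSION; required::VersionNumber=v"1.6") = v >= required

# Warn, rather than fail, when the notebook kernel is older than expected.
if !meets_minimum()
    @warn "This notebook expects Julia 1.6 or later" VERSION
end
```

VERSION is a VersionNumber, so it should be compared against a v"..." literal rather than a string.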

Notable changes in v98

This is a short list of the most notable changes between v97 and v98.

  • Add Julia 1.6 support
  • Add support for LISTAGG to SessionSummarizer
  • Add percentile support to Resource Treemap
  • Allow getTopDimensionsBySession to work with dimension expressions as well
  • SessionSummarizer more aggressively tries to avoid Null references and divide by zero errors
  • Handle numeric database values larger than typemax(Int64)
  • New :COUNT_DISTINCT aggregate parameter
  • Add Table constants for the Error Table (ERR_TABLE)
  • Add color domain and range support to treemaps

Syntax differences

  1. type is now called struct and is immutable by default, use mutable struct to make it mutable.
  2. contains and ismatch are replaced by occursin.
  3. ucfirst and lcfirst are replaced by uppercasefirst and lowercasefirst.
  4. replace takes 2 parameters instead of 3. The second parameter is now a Pair of what used to be the 2nd and 3rd parameter, so change replace(str, pat, rep) to replace(str, pat => rep). To replace n occurrences, use the count=n keyword argument.
  5. find* now returns nothing instead of 0 if nothing is found. Checks such as if findindex > 0 will then throw an error, so use if findindex !== nothing (or if !isnothing(findindex)) instead.
  6. fieldnames no longer works on objects, its argument MUST be a data type, so use fieldnames(typeof(obj)) instead of fieldnames(obj).
  7. The obj.(sym) syntax no longer dereferences an object property using a symbol (in Julia 1.x it is broadcast call syntax), but you can still use getfield(obj, sym).
  8. The datatype Void is now called Nothing.
  9. Template (parametric) functions and types that were previously written as function xyz{T <: Foo}(x::T) should now be written as function xyz(x::T) where {T <: Foo}.
  10. You MUST import a function from another module if you want to add to its methods in your module.
  11. Logging functions are now macros, so use @warn, @info, etc. instead of warn, info. Additionally, the err logging function is now the @error macro.
  12. To test warnings, debug, info, etc., use @test_logs (:warn, "Warn message") <expression>.
  13. reduce, mapreduce, mapfoldl, and mapfoldr now take the initial value as a keyword argument called init.
  14. The sort and sort! methods on dataframes take cols as the second argument and no longer as a keyword argument, so change sort(df, cols=[x, y, z]) to sort(df, [x, y, z]).
  15. Change eltypes(df) to eltype.(eachcol(df)).
  16. Dataframes can no longer be addressed only by column. Specify ! as the row to get all rows for a column: df[!, colname], or alternately, use df.colname if colname is a valid julia identifier.
  17. Dataframe columns may be addressed as symbols, strings, or indices.
  18. The names function applied to a dataframe will return column names as strings, so when searching for a column name this way, use a string rather than a Symbol.
  19. shift! is now called popfirst! and unshift! is called pushfirst!.
  20. Database NULLs are no longer stored as NA; Julia 1.x uses missing to represent NULLs. All code using isna needs to change to ismissing. Additionally, ismissing does not operate on arrays, so the function needs to be broadcast using ismissing.().
  21. round() no longer works on arrays, use the broadcast operator round.() instead.
  22. To round() to a number of decimal places, specify the number with the keyword argument digits.
  23. In filter(f, ::Dict), f is now passed a single pair instead of two arguments.
  24. The second argument to the invoke function used to be a tuple() object but is now a Tuple{} type.
  25. tic() and toq() have been removed. Instead use time_ns() for nanoseconds and time() for second resolution.
  26. names!(df, [col names]) is deprecated, use rename! instead.
  27. isreadable no longer works on filenames, use isfile instead.
  28. You cannot use join to join DataFrames, instead use one of semijoin, leftjoin, antijoin, rightjoin, outerjoin, innerjoin, crossjoin.
  29. findin no longer exists, use indexin instead. Note that indexin works differently, and returns an array containing nothing for items that don't exist.
  30. The by method for DataFrames has been replaced by a combination of combine and groupby.
  31. Binary operators that previously worked on arrays may need to be broadcast. For example, the & and | operators on BitArrays now need to be .& and .|.
    Similarly when assigning a scalar to an array, use the .= operator rather than =.
  32. indmin and indmax no longer exist, use findmin and findmax instead. These return a tuple, and the index is the second element.
  33. STDIN, STDOUT, and STDERR are now stdin, stdout, and stderr, so instead of calling flush(STDOUT), use flush(stdout) instead.
  34. The datatype of DataFrame columns returned by any of our database calls will be Union{Missing, <data type>}. In order to check what <data type> is, apply the nonmissingtype function to the column's element type:
    if nonmissingtype(eltype(df.column)) <: AbstractString.
  35. map is no longer defined on Dicts. Use collect to transform a Dict into a Vector of Pairs and either use map, or a list comprehension on the result.
  36. The keep keyword argument of split is now called keepempty.
  37. readall has been replaced with read. Additionally, to read a file as a string, pass String as the second argument: read("filename", String).
  38. When searching for characters in a String using findfirst, findnext, findprev, and findlast, the arguments are swapped. The needle (Char) comes first, and haystack (String) is second.
  39. The TimeZones module no longer supports deprecated timezones by default, so for example TimeZone("Asia/Calcutta") will throw an exception. Use TimeZone("Asia/Calcutta", TimeZones.Class(:ALL)) to allow deprecated timezone names.
  40. writetable no longer works to save a DataFrame object to a csv file. Use CSV.write("sample.csv", sample_df) instead.
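
Several of the Base Julia changes in the list above can be tried out in one short sketch. It uses only the standard library, so the DataFrames, TimeZones, and CSV items are not covered. The names Counter and double are hypothetical, chosen only for illustration, and the numbers in the comments refer to items in the list above.

```julia
using Test  # provides @test_logs

# (1) `type` became `struct`; add `mutable` when fields must change.
mutable struct Counter
    n::Int
end

# (9) Parametric methods take a trailing `where` clause.
double(x::T) where {T <: Number} = 2x

c = Counter(0)
c.n += 1
field_names = fieldnames(typeof(c))   # (6) pass the type, not the object
field_value = getfield(c, :n)         # (7) look up a field by Symbol

# (2)-(4) String helpers.
has_ell  = occursin("ell", "hello")               # was contains(...)
has_h    = occursin(r"^h", "hello")               # was ismatch(...)
title    = uppercasefirst("julia")                # was ucfirst(...)
swapped  = replace("a-b-c", "-" => "+")           # pattern => replacement
swap_one = replace("a-b-c", "-" => "+"; count=1)  # replace only the first match

# (5), (38) find* returns `nothing` on a miss; the needle comes first.
miss  = findfirst('z', "hello")
hit   = findfirst('l', "hello")
found = miss !== nothing

# (13) The initial value is passed as the `init` keyword.
total = reduce(+, [1, 2, 3]; init=10)

# (19) shift!/unshift! became popfirst!/pushfirst!.
queue = [1, 2, 3]
first_item = popfirst!(queue)
pushfirst!(queue, 0)

# (20)-(22) Broadcast scalar functions over arrays with a dot.
missing_mask = ismissing.([1, missing, 3])
rounded = round.([1.234, 5.678]; digits=1)

# (23) The predicate passed to filter on a Dict receives one Pair.
big = filter(p -> p.second > 1, Dict("a" => 1, "b" => 2))

# (35) map over a Dict by collecting its Pairs first.
doubled = Dict(map(p -> p.first => double(p.second), collect(big)))

# (29) indexin yields `nothing` for items that are absent.
positions = indexin([1, 5], [1, 2, 3])

# (31) Broadcast element-wise operators and scalar assignment.
both = [true, false] .& [true, true]
filled = zeros(Int, 3)
filled .= 7

# (32) findmin returns (value, index).
min_value, min_index = findmin([4, 1, 9])

# (34) Strip Missing from an element type before comparing it.
is_text = nonmissingtype(Union{Missing, String}) <: AbstractString

# (36) The `keep` keyword of split became `keepempty`.
parts = split("a,,b", ","; keepempty=false)

# (37) Read a whole file into a String.
path, io = mktemp()
write(io, "hello")
close(io)
contents = read(path, String)
rm(path)

# (11), (12) Logging macros, and testing for an expected log record.
@test_logs (:warn, "Warn message") @warn "Warn message"

# (25) Time a block with time_ns() instead of tic()/toq().
t0 = time_ns()
sum(1:1000)
elapsed_seconds = (time_ns() - t0) / 1e9
```

Comparing with !== nothing rather than != nothing keeps the check a plain identity test, which is the form the Julia manual recommends for nothing.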

Other notes on usage

Note that some coding styles were already deprecated in 0.4, and these no longer exist in 1.6. For example:

  • Instead of using the string "BEACON_TABLE", use the constant BEACONS_TABLE. This should be done in both 0.4 and 1.6.
  • Instead of using this construct:
if !isa(QueryAPI.gLatestCon, DBHandles.NullHandle)
    QueryAPI.gLatestCon.conn.disconnect!() 
end

Call disconnect!() instead.

  • padDateTime now has to be qualified as C3Viz.padDateTime.
  • SQL generators that used to be in DBUtilities (eg: DBUtilities.isCached) are now in CommonSQL.
  • The replaceNA! and replaceNaN! functions aren't strictly required any more, as Julia now has a built-in replace! method that works on DataFrame columns. Any of the following styles will work:
    • replace!(df.numeric_colname, missing => 0.0)
    • replace!(df[!, :numeric_colname], NaN => 0.0)
    • replace!(df[!, "string colname"], missing => "")
    • You can even pass in multiple replacements for the same column in one call:
      replace!(df.numeric_colname, missing => 0.0, NaN => 0.0)
  • replaceNA! has been renamed to replacemissing!. The main advantage of replacemissing! over replace! is that it can do data type promotion of columns. This is useful, for example, if you have a column that has only missing values. In that case the column's type is Missing and won't accept a value of a different type. replacemissing! changes the type to allow for a composite value.
  • By default many functions will print out Info messages showing query execution time. To suppress these messages, add the following to the top of your notebook:
    Logging.disable_logging(Logging.Info)
  • When referencing all rows in a dataframe column with a name that does not need to be quoted (for example, green_column), you can use either df[!,:green_column] OR df.green_column.
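
The replace! behaviour described above, including the all-missing case, can be sketched with plain vectors. This rests on two assumptions made so the example runs without a database connection or the DSWB packages: plain vectors stand in for DataFrame columns, and Base's coalesce stands in for the type promotion that replacemissing! performs. It is not the replacemissing! implementation.

```julia
# Plain vectors stand in for DataFrame columns (an assumption for this sketch).
numeric = Union{Missing, Float64}[1.5, missing, NaN]
replace!(numeric, missing => 0.0, NaN => 0.0)   # several replacements in one call

text = Union{Missing, String}["a", missing]
replace!(text, missing => "")

# A column holding only missing values has element type Missing,
# so an in-place replacement cannot store a Float64 in it.
all_missing = [missing, missing]
in_place_failed = try
    replace!(all_missing, missing => 0.0)
    false
catch
    true
end

# Building a new vector allows the element type to change.
# coalesce is Base Julia, used here as a stand-in for replacemissing!.
promoted = coalesce.(all_missing, 0.0)
```

In the first two cases the element type stays Union{Missing, T} after replace!, because the replacement happens in place. Only the coalesce call produces a vector with a new element type.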

Supported DSWB Modules/Functions

DSWB

BeaconAnalysis

  • getAggregateMetricByDimension
  • getBeaconCount
  • getBeaconsFirstAndLast
  • getGroupPercentages
  • getTopLandingPages
  • getTopReferrers
  • chartTopN
  • chartLoadTimes
  • chartBeaconTreemap

DSWBSessionAnalysis

  1. basic
  • getMedianSessionDuration,
  • getMedianSessionLength,
  • getSessionDistributionByDay,
  • getSessionDistributionByDayAndTopDimensions,
  • getSessionDurationQuantilesByDatepart,
  • chartSessionDurationQuantilesByDatepart
  • getTopPageGroupsByExitRate,
  • getTopDimensionsBySession,
  • getWeightedSessionDimensions,
  • getConcurrentSessionsAndBeaconsByDatepart,
  • chartConcurrentSessionsAndBeaconsOverTime
  2. bounce related
  • getBounceRate,
  • getBounceRateByLandingPage,
  • chartBounceRateByDimension,
  • getBounceRateByDatepart,
  • chartBounceRateByDatepart,
  • getBounceRateByLoadTime,
  • chartBouncesVsLoadTimes,
  • getLoadTimeAggregateAndBounceRates,
  • chartLoadTimeMediansAndBounceRatesByPageGroup,
  • getRetentionRateByDimension
  3. conversion related
  • getConversionRate,
  • chartConversionRateByDimension,
  • getConversionRatesByTimerAvg,
  • chartConversionsVsLoadTimes,
  • getPeakConversionRate,
  • getPeakConversions,
  • getExternalReferrersConversionsAndLoadTimes,
  • chartExternalReferrerSummary,
  • getPageGroupsAndSessionsWithConversionImpact,
  • getTopGroupsByConversionImpact,
  • chartConversionImpactByPageGroup,
  • getTopGroupsByActivityImpact,
  • chartActivityImpactByPageGroup

SessionSummarizer

  • getSessionDistribution,
  • getActiveSessions,
  • getSessionSummary

DSWBErrorAnalysis

  • getNetworkErrorRateByDimension
  • getErrorCount
  • getAvgErrorCount
  • ErrorMessageCluster
  • getNormalizedErrorMessage
  • getClusterNoisePts
  • addNewErrorsToGroups!
  • getRemainingPagesDistribution

ResourceAnalysis

  • getResources
  • chartResources
  • getResourceTrend
  • chartResourceTrend
  • getTreemapResources
  • chartTreemapResources
  • getLoadTimeStats
  • chartLoadTimeStats
  • getResourceServerStats
  • chartResourceServerStats
  • getRISResources
  • getRISVendors
  • getLoadTimeDistribution
  • chartLoadTimeDistribution
  • getResourceCount

DSWBUtilities

  • datetime2ms,
  • ms2datetime,
  • NormalizedTimeType,
  • timestampToZonedDateTime,
  • DATE_FORMATS,
  • DATE_FORMATS_TZ,
  • getDateRange,
  • getArgs,
  • getArgTypes,
  • getKWArgs,
  • @soasta_time,
  • isdatepart,
  • extractDatepart,
  • splitDatepart,
  • isnan_proxy,
  • mapGeoColumns,
  • parseBucketParameters,
  • plugTimeHoles!,
  • sendToSlack (new support for rich blocks),
  • sendmail (replaces sendSOASTAEmail),
  • decodeBoomerangCompressedLog,
  • decodeBoomerangCompressedTimeline

DSWBConnecting

  • getConnection,
  • getTenantAndAppName,
  • setConversionMetric,
  • setEndpoint,
  • setEndpointAndTable,
  • setEndpointForTenantId,
  • setEndpointFromMPulse,
  • setSnowflakeEndpoint,
  • setTable,
  • setTableForDomainId,
  • disconnect!

C3Viz

  • drawC3Viz,
  • drawBoxPlots,
  • drawTree,
  • displayTitle,
  • C3Viz.padDateTime

mPulseConstants

  • mPulseConstants.mP2DSWB
  • mPulseConstants.getFriendlyTimerName
  • mPulseConstants.getTimerColor

DBConstants

  • DBConstants.getPARAM()
  • DBConstants.setPARAM!()
  • DBConstants.getAD_META()
  • DBConstants.setAD_META!()
  • TIMER_TYPE,
  • DATEPART_DISPLAY,
  • DATEPART_TO_JULIA,
  • BASE_SUPPORTED_DATEPARTS,
  • BASE_SUPPORTED_WINDOW_FUNCTIONS,
  • BASE_EXTENDED_WINDOW_FUNCTIONS,
  • BASE_SUPPORTED_AGGREGATE_FUNCTIONS,
  • UnsupportedWindowFunction,
  • BEACON_TYPE_PAGE_VIEWS,
  • BEACON_TYPE_XHR,
  • BEACON_TYPE_SPA,
  • BEACON_TYPE_API,
  • BEACON_TYPE_ERROR,
  • BEACON_TYPE_USER_INTERACTIONS,
  • BEACON_TABLE_PREFIX_SF,
  • BEACON_TABLE_NAME_ASGARD,
  • RT_TABLE_NAME_ASGARD,
  • SESSION_TABLE_NAME_ASGARD,
  • ERROR_TABLE_NAME_ASGARD,
  • DSN_PREFIX,
  • PARAMS,
  • DSN,
  • CONNECTION,
  • DOMAIN_OBJ,
  • TENANT_OBJ,
  • BEACONS_TABLE,
  • RESOURCE_TABLE,
  • SESSIONS_TABLE,
  • ERROR_TABLE,
  • APPID,
  • CUSTOM_METRICS,
  • CONV_METRIC,
  • SESSION_TIMEOUT,
  • MPULSE_AUTH_TOKEN,
  • MPULSE_TIMER_BUCKET_DEFAULT_EDGES,
  • REQUEST_HASH,
  • MP_OBJECT_ENDPOINT,
  • DSWB_SESSION_ID,
  • AD_META,
  • SOASTALIB,
  • TEMPLATE_DIR,
  • COLORS,
  • BASE64_CHARS

DBI

  • DBI.Connection
  • execute
  • select
  • disconnect!
  • isClosed
  • cursor
  • commit
  • rollback
  • setmeta
  • printDSN
  • listDSNs
  • ODBC_SECTIONS
  • setVerbose

DataFrameUtilities

  • beautifyDF
  • injectDefaultTableStyling
  • replacemissing
  • replacemissing!
  • replaceNaN
  • replaceNaN!

DistributionStats

  • percentile
  • median
  • gstddev
  • cdf_to_pdf
  • pdf_to_cdf

CommonSQL

  • truncateSQLTimestamp,
  • assetTypeGuessSQL,
  • contentTypeGuessSQL,
  • isCached
  • CommonSQL.filter2Snowflake
  • CommonSQL.filter2Asgard
  • CommonSQL.filter2String
  • translateFilters,
  • CommonSQL.applyAggregate
  • CommonSQL.parse_url
  • CommonSQL.IF
  • CommonSQL.SQL_ARRAY
  • CommonSQL.ARRAY_JOIN
  • CommonSQL.REGEXP_SUBSTR/REGEXP_EXTRACT
  • CommonSQL.MEDIAN
  • CommonSQL.PERCENTILE
  • CommonSQL.FIRST_VALUE
  • CommonSQL.LAST_VALUE
  • CommonSQL.LISTAGG
  • CommonSQL.extractGroupDimensions
  • CommonSQL.generateAggregateExpressions
  • fixTableName,
  • RT_HOST_EXPRESSION,
  • CommonSQL.hostToDomain
  • CommonSQL.hasTao

SQLFilters

  • SQLFilter,
  • SQLFilterUnion,
  • combineFilters,
  • escapeSingleQuote,
  • unescapeSingleQuote,
  • isNull,
  • isNotNull,
  • like,
  • notlike,
  • ilike,
  • notilike,
  • rlike,
  • notrlike,
  • eq,
  • noteq,
  • leq,
  • geq,
  • lt,
  • gt,
  • between,
  • notin,
  • isin

AggregateWaterfall

  • getAggregateWaterfallDistribution
  • chartAggregateWaterfall

WhatIf

  • WhatIf.getWhatIfAnalysis

WhatChanged

  • WhatChanged.whatElseChanged

AnomalyDetection

  • AnomalyDetection.chartAlerts
  • AnomalyDetection.chartAnomalies
  • AnomalyDetection.chartDynamicAlert

IQR

KNNICAD

NAB

DataStories

  • DataStories.getResourceImpactData